Tag: llvm

Introducing build-bom

TL;DR: The build-bom tool makes it easy to obtain LLVM IR for C/C++ programs without requiring any modifications to build systems, by using the low-level debugger primitive ptrace. This is something you may want if you build tools to analyze C/C++ programs. A typical use of build-bom looks like:

    ./configure
    build-bom generate-bitcode -- make
    build-bom extract-bitcode /path/to/binary --output binary.bc

Background

What is LLVM IR?

LLVM IR is the intermediate representation of code used by LLV…

whole-program-llvm Improvements

whole-program-llvm is a set of scripts that I started in order to compile programs and libraries into single LLVM bitcode files. The scripts stand in for a compiler (either clang or gcc) and compile each file in a program or library twice: once as the build system intended and once as LLVM bitcode. Another script, extract-bc, can then be run over the final binary to link the LLVM bitcode files into a single large bitcode file. This is very useful for performing whole-program analysis (hence the name of…
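The compile-twice idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual whole-program-llvm code: the function names (plan_commands, bitcode_path), the hidden ".name.bc" naming scheme, and the simplified flag handling are all assumptions made for the example. It only plans the command lines and does not run a compiler.

```python
import os

# Suffixes treated as source files in this sketch (an assumption, not exhaustive).
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def bitcode_path(source):
    """Hypothetical naming scheme: src/foo.c -> src/.foo.c.bc."""
    directory, name = os.path.split(source)
    return os.path.join(directory, "." + name + ".bc")


def plan_commands(argv, native_compiler="gcc", bitcode_compiler="clang"):
    """Return the command lines a stand-in compiler would run for one invocation.

    The first command is the build system's request, passed through unchanged.
    For compile steps (-c), one extra command per source file emits LLVM bitcode.
    """
    commands = [[native_compiler] + list(argv)]

    # Link steps and other non-compile invocations are passed through only.
    if "-c" not in argv:
        return commands

    flags, sources = [], []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This is the argument of a preceding "-o"; drop it.
            skip_next = False
        elif arg == "-o":
            # Only the separated "-o file" form is handled in this sketch.
            skip_next = True
        elif arg.endswith(SOURCE_SUFFIXES):
            sources.append(arg)
        else:
            flags.append(arg)

    for src in sources:
        # Same flags as the native compile, but emitting bitcode to a side file.
        commands.append(
            [bitcode_compiler] + flags + ["-emit-llvm", src, "-o", bitcode_path(src)]
        )
    return commands
```

A real wrapper would then execute each planned command and record where the bitcode file lives so that a later extraction step can find and link them; how that bookkeeping is done is not shown here.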

A Hoopl Experience

Introduction

I have read about Generalized Algebraic Data Types (GADTs) before, at least as implemented in GHC. The standard type-safe expression evaluator was interesting, but it never left much of an impression on me. Last week, I ran into them in real code for the first time while I was playing with hoopl, a library for representing control-flow graphs and performing dataflow analysis and graph rewriting. The use of GADTs in the hoopl code was enlightening and now I think I have a reasonabl…

Installing llvm-tools

In my last post I neglected to provide installation instructions. For most systems, it should be fairly straightforward: Ensure that dot, llvm-config, ghc, and cabal are in your PATH. The last two are provided by the Haskell Platform. The 2012 releases should work. Additionally, ensure that ~/.cabal/bin is in your PATH, since the binaries will be installed there (and it may need to be in your PATH during the build process, too). Run the following script: REPOSITORIES="hbgl-experimental…

A Handy LLVM Tool (ViewIRGraph)

I realized that I forgot to mention another repository related to my last post: llvm-tools. As the name suggests, this repository contains some useful tools based on my llvm-analysis library. The most interesting tool for people who aren't me is ViewIRGraph, which makes it easy to visualize several interesting program graphs (anything supported by llvm-analysis). The help output gives a reasonable breakdown:

    ViewIRGraph - View different graphs for LLVM IR modules in a variety of formats U…

Program Analysis with LLVM in Haskell

Introduction

I have had the code on GitHub for quite some time, so it seems like I should say something about my LLVM program analysis tools. The primary repository is llvm-analysis, which provides a Haskell interface for analyzing the LLVM IR. The LLVM IR is a high-level assembly language for a virtual machine with an unbounded number of registers. This is a virtual machine in the sense of a piece of hardware that does not exist, rather than a JVM-style virtual machine that programs run on. LLVM IR is converted dir…

Unification with unification-fd

I finally feel like a real programming languages person. I just used unification to solve a problem other than type checking a variant of the lambda calculus. I used the excellent unification-fd package; it is a bit light on documentation, but the included tests were enough for me to figure out how to use it. I might post something more detailed later on. My problem arose due to the type system rewrite in LLVM 3.0. Prior to this, types were all uniqued: there was one instance of each structurall…