<!doctype html>
<meta charset=utf-8>
<meta name=viewport content=width=device-width>
<title>Basics of the Mach-O file format :: samhuri.net</title>
<link rel=stylesheet href=../assets/blog.css>
<script>
var _gaq = _gaq || [];
_gaq.push( ['_setAccount', 'UA-214054-5']
         , ['_trackPageview']
         )

;(function() {
  var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true
  ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'
  var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s)
})()
</script>
<script src=http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js></script>
<script>
window.SJS = window.SJS || {}
SJS.filename = "basics-of-the-mach-o-file-format.html"
</script>
<script src=../assets/blog-all.min.js></script>
<header>
<h1><a href=index.html>sjs' blog</a></h1>
</header>
<nav id=breadcrumbs><a href=../>samhuri.net</a> → <a href=index.html>blog</a></nav>
<div id=posts>

<article>
<header>
<h1><a href=basics-of-the-mach-o-file-format.html>Basics of the Mach-O file format</a></h1>
<time>January 18, 2010</time>
</header>
<p><i>This post is part of a series on generating basic x86 Mach-O files
with Ruby. The
<a href="working-with-c-style-structs-in-ruby.html">first post</a>
introduced CStruct, a Ruby class used to serialize
simple struct-like objects.</i></p>

<p>Please note that the best way to learn about Mach-O properly is to
read Apple's
<a href="http://developer.apple.com/Mac/library/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html#//apple_ref/doc/uid/TP40000895-CH248-SW3">documentation on Mach-O</a>,
which is pretty good when combined with the
comments in /usr/include/mach-o/*.h. These posts only cover
the basics necessary to generate a simple object file for linking with
ld or gcc, and are not meant to be comprehensive.</p>

<h2>Mach-O File Format Overview</h2>

<p>A Mach-O file consists of two main pieces: the <b>header</b> and
the <b>data</b>. The header is basically a map of the file: it describes
what the file contains and where each piece is located. The
data comes directly after the header and consists of a number of
binary blobs, one after the other.</p>

<p>The header contains three types of records: the <b>Mach header</b>,
<b>segments</b>, and <b>sections</b>. Each binary blob is described
by a named section in the header. Sections are grouped into one or
more named segments. The Mach header is just one part of the header
and should not be confused with the entire header. It contains
information about the file as a whole, and also specifies the number of
load commands, which for our purposes means the number of segments.</p>

<p>Take a quick look at <b>Figure 1</b> in
<a href="http://developer.apple.com/Mac/library/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html#//apple_ref/doc/uid/TP40000895-CH248-SW3">Apple's Mach-O overview</a>,
which illustrates this quite nicely.</p>

<p>A very basic Mach object file consists of a header followed by a single
blob of machine code. That blob could be described by a single
section named __text, inside a single nameless segment. Here's a
diagram showing the layout of such a file:</p>

<pre>

        ,---------------------------,
 Header |  Mach header              |
        |    Segment 1              |
        |      Section 1 (__text)   | --,
        |---------------------------|   |
 Data   |           blob            | &lt;-'
        '---------------------------'
</pre>

<h2>The Mach Header</h2>

<p>The Mach header contains the architecture (cpu type), the type of
file (object in our case), and the number of segments. There is more
to it, but that's about all we care about. To see exactly what's in a
Mach header, fire up a shell and type <tt>otool -h /bin/zsh</tt> (on a
Mac).</p>

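<p>If you'd rather poke at those fields from Ruby, here is a minimal
sketch that decodes them from raw bytes. It assumes the 32-bit,
little-endian layout of <tt>struct mach_header</tt> in
/usr/include/mach-o/loader.h (seven 32-bit fields, 28 bytes), and
<tt>parse_mach_header</tt> is a made-up helper name, not part of CStruct
or of the code in this series.</p>

```ruby
# Hypothetical helper, for illustration only.
# Field names follow struct mach_header in mach-o/loader.h (32-bit variant).
MACH_HEADER_FIELDS = %i[magic cputype cpusubtype filetype ncmds sizeofcmds flags].freeze

def parse_mach_header(bytes)
  raise ArgumentError, 'a 32-bit Mach header is 28 bytes' unless bytes.bytesize >= 28

  # 'V7' reads seven unsigned 32-bit little-endian integers.
  values = bytes.unpack('V7')
  MACH_HEADER_FIELDS.zip(values).to_h
end

# A hand-made header: MH_MAGIC, CPU_TYPE_X86 (7), CPU_SUBTYPE_I386_ALL (3),
# MH_OBJECT (1), one load command occupying 124 bytes, no flags.
sample = [0xfeedface, 7, 3, 1, 1, 124, 0].pack('V7')
header = parse_mach_header(sample)

puts format('magic=0x%08x filetype=%d ncmds=%d',
            header[:magic], header[:filetype], header[:ncmds])
# prints "magic=0xfeedface filetype=1 ncmds=1"
```

<p>To try it on a real file you could pass in
<tt>File.binread(path, 28)</tt>. Universal and 64-bit binaries start
with a different magic number, and this sketch does not handle them.</p>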
<p>Using
<a href="working-with-c-style-structs-in-ruby.html">CStruct</a>
we define the Mach header like so:</p>

<script src="http://gist.github.com/280635.js"></script>

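<p>The gist above is loaded from GitHub, so in case it isn't handy, here
is a rough, self-contained approximation that uses a plain
<tt>Struct</tt> and <tt>Array#pack</tt> instead of CStruct. The
<tt>MachHeader</tt> class and its <tt>serialize</tt> method are
stand-ins invented for this sketch; the constants are the values
found in the mach-o headers.</p>

```ruby
# Stand-in for the CStruct-based definition; not the code from the gist.
MachHeader = Struct.new(:magic, :cputype, :cpusubtype, :filetype,
                        :ncmds, :sizeofcmds, :flags) do
  # Every field is 32 bits wide. cputype and cpusubtype are signed in C,
  # but the small positive values used here pack identically as unsigned.
  def serialize
    to_a.pack('V7')
  end
end

MH_MAGIC             = 0xfeedface # 32-bit Mach-O magic number
CPU_TYPE_X86         = 7
CPU_SUBTYPE_I386_ALL = 3
MH_OBJECT            = 0x1        # relocatable object file

# One load command (a segment with one section) taking 56 + 68 bytes.
header = MachHeader.new(MH_MAGIC, CPU_TYPE_X86, CPU_SUBTYPE_I386_ALL,
                        MH_OBJECT, 1, 56 + 68, 0)
bytes = header.serialize

puts bytes.bytesize                   # prints 28
puts bytes[0, 4].unpack('H*').first   # prints "cefaedfe" (magic, little-endian)
```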
<h2>Segments</h2>

<p>Segments, or <b>segment commands</b>, specify where in memory the
segment should be loaded by the OS, and the number of bytes to
allocate for it. They also specify which bytes inside the
file are part of that segment, and how many sections it contains.</p>

<p>One benefit to generating an object file rather than an executable is
that we let the linker worry about some details. One of those details
is where in memory segments will ultimately end up.</p>

<p>Names are optional and can be arbitrary, but the convention is to
name segments with uppercase letters preceded by two underscores,
e.g. __DATA or __TEXT.</p>

<p>The code exposes some more details about segment commands, but should
be easy enough to follow.</p>

<script src="http://gist.github.com/280642.js"></script>

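<p>As a rough companion to the gist, here is a sketch of a 32-bit
segment command built with <tt>Array#pack</tt>. The field order and
sizes follow <tt>struct segment_command</tt> in mach-o/loader.h; the
<tt>segment_command</tt> method and its keyword arguments are names
made up for this example, and leaving <tt>vmaddr</tt> at zero is just
the simplest choice for an object file.</p>

```ruby
LC_SEGMENT           = 0x1 # load command type for a 32-bit segment
SEGMENT_COMMAND_SIZE = 56  # bytes in struct segment_command
SECTION_SIZE         = 68  # bytes in struct section

# Hypothetical helper: serialize one segment command.
def segment_command(name:, fileoff:, filesize:, nsects:)
  [
    LC_SEGMENT,
    SEGMENT_COMMAND_SIZE + nsects * SECTION_SIZE, # cmdsize covers the section headers too
    name,     # segname: 'a16' pads it to 16 bytes with NULs
    0,        # vmaddr: the linker decides the final address
    filesize, # vmsize: bytes to allocate in memory
    fileoff,  # where the segment's data starts in the file
    filesize, # how many bytes of the file belong to it
    7,        # maxprot: read | write | execute
    7,        # initprot
    nsects,
    0         # flags
  ].pack('V2a16V8')
end

# A nameless segment holding one section with a single byte of data,
# placed right after a 152-byte header (28 + 56 + 68).
cmd = segment_command(name: '', fileoff: 152, filesize: 1, nsects: 1)
puts cmd.bytesize # prints 56
```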
<h2>Sections</h2>

<p>The sections within a segment are described one after another,
directly after that segment's command. Sections define their name,
address in memory, size, offset of section data within the file, and
segment name. The segment name might seem redundant, but in the next
post we'll see why this is useful information to have in the section
header.</p>

<p>Sections can optionally specify a map to addresses within their
binary blob, called a <b>relocation table</b>. This is used by the
linker. Since we're letting the linker work out where to place
everything in memory, the addresses inside our machine code will need
to be updated once it has done so.</p>

<p>By convention sections are named with lowercase letters preceded by
two underscores, e.g. __bss or __text.</p>

<p>Finally, the Ruby code describing section structs:</p>

<script src="http://gist.github.com/280643.js"></script>

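<p>Again as an approximation of the gist rather than a copy of it, a
32-bit section header can be sketched with <tt>Array#pack</tt> like
this. The layout follows <tt>struct section</tt> in mach-o/loader.h;
<tt>section_header</tt> is a hypothetical helper, and the relocation
fields are simply left at zero here.</p>

```ruby
S_ATTR_PURE_INSTRUCTIONS = 0x80000000 # section contains only machine code

# Hypothetical helper: serialize one 32-bit section header (68 bytes).
def section_header(sectname:, segname:, size:, offset:, flags: 0)
  [
    sectname, # 16 bytes, NUL padded
    segname,  # 16 bytes, NUL padded
    0,        # addr: address in memory, left to the linker
    size,     # size of the section data in bytes
    offset,   # offset of the section data within the file
    0,        # align, as a power of two
    0,        # reloff: file offset of the relocation table (none here)
    0,        # nreloc: number of relocation entries
    flags,
    0,        # reserved1
    0         # reserved2
  ].pack('a16a16V9')
end

text = section_header(sectname: '__text', segname: '__TEXT',
                      size: 1, offset: 152,
                      flags: S_ATTR_PURE_INSTRUCTIONS)
puts text.bytesize # prints 68
```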
<h2>macho.rb</h2>

<p>As much of the Mach-O format as we need is defined in
<a href="http://github.com/samsonjs/compiler/blob/master/asm/macho.rb">asm/macho.rb</a>.
The Mach header, segment commands, sections,
relocation tables, and symbol table structs are all there, with a few
constants as well.</p>

<p>I'll cover symbol tables and relocation tables in my next post.</p>

<h2>Looking at real Mach-O files</h2>

<p>To see the segments and sections of an object file, run
<tt>otool -l /usr/lib/crt1.o</tt>. <b>-l</b> is for load commands.
If you want to see why we stick to generating object files instead of
executables, run <tt>otool -l /bin/zsh</tt>. They are complicated
beasts.</p>

<p>If you want to see the actual data for a section, otool provides a
couple of ways to do this. The first is to use
<tt>otool -s &lt;segment&gt; &lt;section&gt; &lt;file&gt;</tt> for an arbitrary
section. To see the contents of a well-known section, such as __text
in the __TEXT segment, use <tt>otool -t /usr/bin/true</tt>. You can
also disassemble the __text section with
<tt>otool -tv /usr/bin/true</tt>.</p>

<p>You'll get to know otool quite well if you work with Mach-O.</p>

<h2>Take a break</h2>

<p>That was probably a lot to digest, and to make real sense of it you
might need to read some of the
<a href="http://developer.apple.com/Mac/library/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html#//apple_ref/doc/uid/TP40000895-CH248-SW3">official documentation</a>.</p>

<p>We're close to being able to describe a minimal Mach object file
that can be linked, and the resulting binary executed. By the end of
the next post we'll be there.</p>

<p><i>(You can almost do that with what we know now. If you
create a Mach file with a Mach header (ncmds=1), a single unnamed
segment (nsects=1), a section named __text with a segment
name of __TEXT, and some x86 machine code as the section data, you
would almost have a useful Mach object file.)</i></p>

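<p>For the curious, that "almost useful" file can be sketched in a few
lines of plain Ruby. This is only an illustration under some
assumptions: the 32-bit struct layouts from mach-o/loader.h (28, 56 and
68 bytes), a single <tt>ret</tt> instruction as the machine code, and no
symbol table, so the linker would have nothing to call. The variable
names are made up for this sketch.</p>

```ruby
# The section data: one x86 `ret` instruction (opcode 0xC3).
code = [0xc3].pack('C')

# Mach header + segment command + section header.
header_size = 28 + 56 + 68

# Mach header: MH_MAGIC, CPU_TYPE_X86, CPU_SUBTYPE_I386_ALL, MH_OBJECT,
# ncmds=1, sizeofcmds, flags.
mach_header = [0xfeedface, 7, 3, 1, 1, 56 + 68, 0].pack('V7')

# A single unnamed segment (LC_SEGMENT) with nsects=1.
segment = [0x1, 56 + 68, '', 0, code.bytesize, header_size,
           code.bytesize, 7, 7, 1, 0].pack('V2a16V8')

# A section named __text with a segment name of __TEXT, flagged as
# pure instructions (S_ATTR_PURE_INSTRUCTIONS).
section = ['__text', '__TEXT', 0, code.bytesize, header_size,
           0, 0, 0, 0x80000000, 0, 0].pack('a16a16V9')

object_file = mach_header + segment + section + code
puts object_file.bytesize # prints 153

# Uncomment to write it out and inspect it with otool -h and otool -l:
# File.binwrite('minimal.o', object_file)
```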
<p>Till next time, happy hacking!</p>

</article>
</div>
<p class=sep>༄</p>
<div id=around>
<a id=prev href=working-with-c-style-structs-in-ruby.html>← Working with C-style structs in Ruby</a>
<a id=next href=a-preview-of-mach-o-file-generation.html>A preview of Mach-O file generation →</a>
<br style=clear:both>
</div>
<p id=need-js align=center><strong>(discussion requires JavaScript)</strong></p>
<div class=center id=sd-container><a id=sd href=#comment-stuff>show the discussion</a></div>
<div id=comment-stuff>
<div id=comments>
<img id=discussion-spinner src=../assets/spinner.gif>
</div>
<form id=comment-form>
<input name=post type=hidden value=basics-of-the-mach-o-file-format.html>
<p><input name=name type=text placeholder=name></p>
<p><input name=email type=text placeholder=email></p>
<p><input name=url type=text placeholder=url></p>
<p><textarea name=body placeholder=thoughts></textarea></p>
<p align=center><input type=submit value='so there'></p>
</form>
</div>
<script type="text/html" id="comment_tmpl">
<div class=comment>
<p>
<% if (url) { %>
<a href="<%= url %>"><%= name %></a>
<% } else { %>
<%= name %>
<% } %>
@ <%= strftime('%F %r', new Date(timestamp)) %>
</p>
<blockquote><%= html %></blockquote>
</div>
</script>
<footer>
<a href=https://twitter.com/_sjs>@_sjs</a>
</footer>