Openbook: Introduction

For the past few months I’ve been working on a side project I’ve dubbed “openbook”. My plan was to give up on WordPress and not post anything until I had openbook up and running, so that the development blog could live on the running build. Development has been slow going, though, so I’ll continue using WordPress to document progress on the project.

Basically, I have a lot of problems with the way web-based tools work. Most of the problems are cultural, not technical. I worry about the implications as we as a society become immune to violations of our privacy, bait and switch tactics, and addiction as a business model. It seems that the majority of internet entrepreneurs have embraced these poor cultural trends. Perhaps I can do something about it.

Read more about my ideas for this project in my working document


Generating HTML pages from Latex

While LaTeX is pretty much “not designed” for web content, it is very useful to be able to generate a web version of a LaTeX document. LaTeX is clearly meant for typesetting layouts on a pre-defined page, but when you want to share the information with others, it’s generally a lot easier for them to go to a webpage than it is to download and open a PDF. A webpage is also generally easier to read than a PDF because the content is continuous, and one can scroll around and click hyperlinks far more fluidly than in a PDF.

Now that MathML and SVG are becoming more supported by web browsers, there is a strong case for sharing mathy documents on the web in addition to paper documents (or PDFs, which are only slightly more readable than paper).

To this end, I’ve been evaluating several LaTeX-to-HTML converters. I’ve tried the following on Linux (Ubuntu):

  1. TTH
  2. LaTeX2HTML
  3. TeX4ht
  4. LaTeXML

By far my favorite is LaTeXML. It generates crisp, simple pages using MathML and CSS, making it easy to customize the style. It doesn’t support many of the packages that I would like to use (like algorithm2e), but then again none of the others do either. Also, the arXiv project is working on a branch of LaTeXML, so there is promise that it will grow quickly to support a lot of the best packages.

Document Setup

My current approach to generating both PDF and HTML output from LaTeX source is to use a separate top-level document for each. The directory structure looks something like this:

    document
     |- document_html.tex
     |- document_pdf.tex
     |- document.tex
     |- preamble_common.tex
     |- preamble_html.tex
     |- preamble_pdf.tex
     \- references.bib

The two versions of document_[output].tex are the top-level files. They look like this:

%document_html.tex
 
\documentclass[10pt]{article}
\input{preamble_common}
\input{preamble_html} 
\begin{document}
\input{document}
\end{document}

The PDF version is the same, except that it inputs preamble_pdf instead. Note that in LaTeX you cannot nest \include directives, but you can nest \input directives. Also, \include inserts a page break, so there is no need to use it here. Instead, document.tex may \include its chapters as separate tex files or the like.

Makefile

To ease the process of generating the different types, I’m using a makefile.

# The following definitions are the specifics of this project
PDF_OUTPUT  :=  document.pdf
HTML_OUTPUT :=  document.html
 
PDF_MAIN	:=  document_pdf.tex
HTML_MAIN   :=  document_html.tex
 
COMMON_TEX 	:=	document.tex \
                preamble_common.tex
 
PDF_TEX		:=  $(COMMON_TEX) \
                document_pdf.tex \
                preamble_pdf.tex
 
HTML_TEX    :=  $(COMMON_TEX) \
                document_html.tex \
                preamble_html.tex 
 
BIB         :=  references.bib
 
 
 
# these variables are the dependencies for the outputs
PDF_SRC     := $(PDF_TEX) $(BIB)
HTML_SRC    := $(HTML_TEX) $(BIB)
 
# the 'all' target will make both the pdf and html outputs
# (these targets are not files, so declare them phony)
.PHONY: all pdf html clean
all: pdf html
 
# the 'pdf' target will make the pdf output
pdf: $(PDF_OUTPUT)
 
# the 'html' target will make the html output
html: $(HTML_OUTPUT)
 
# the pdf output depends on the pdf tex files
# we use a shell script to optionally run pdflatex multiple times until the
# output does not suggest that we rerun latex
$(PDF_OUTPUT): $(PDF_SRC)
	@echo "Running pdflatex on $(PDF_MAIN)"
	@pdflatex $(basename $(PDF_MAIN)) > $(basename $(PDF_MAIN))_0.log
	@echo "Running bibtex"
	@-bibtex   $(basename $(PDF_MAIN)) > bibtex_pdf.log 
	@echo "Checking for rerun suggestion"
	@for ITER in 1 2 3 4; do \
		STABELIZED=`grep "Rerun" $(basename $(PDF_MAIN)).log`; \
		if [ -z "$$STABELIZED" ]; then \
			echo "Document stabilized after $$ITER iterations"; \
			break; \
		fi; \
		echo "Document not stabilized, rerunning pdflatex"; \
		pdflatex $(basename $(PDF_MAIN)) > $(basename $(PDF_MAIN))_$$ITER.log; \
	done
	@echo "Copying pdf to target file"
	@cp $(basename $(PDF_MAIN)).pdf $(PDF_OUTPUT)
 
# the html output depends on the html tex files
# we have to process all of the bibliography files separately into xml files, 
# and then include them all in the call to the postprocessor
$(HTML_OUTPUT): $(HTML_SRC)
	@echo "Running latexml on $(HTML_MAIN)"
	@latexml $(HTML_MAIN) --dest=$(basename $(HTML_OUTPUT)).xml > $(basename $(HTML_MAIN)).log 2>&1
	@BIBSTRING=""; \
	for BIBFILE in $(BIB); do \
		echo "Running latexml on $$BIBFILE"; \
		XMLFILE=`basename "$$BIBFILE" .bib`.xml; \
		LOGFILE=`basename "$$BIBFILE" .bib`_html.log; \
	    latexml $$BIBFILE --dest=$$XMLFILE > $$LOGFILE 2>&1; \
	    BIBSTRING="$$BIBSTRING --bibliography=$$XMLFILE"; \
	done; \
	echo $$BIBSTRING > bibstring.txt
	@echo "postprocessing with `cat bibstring.txt`"
	@latexmlpost $(basename $(HTML_OUTPUT)).xml `cat bibstring.txt` --dest=$(HTML_OUTPUT) --css=navbar-left.css
 
# the 2>/dev/null redirects stderr to the null device so that we don't get error
# messages in the console when rm has nothing to remove
clean:
	@-rm -v *.log 2>/dev/null
	@-rm -v *.out 2>/dev/null
	@-rm -v *.aux 2>/dev/null
	@-rm -v *.xml 2>/dev/null
	@-rm -v *.pdf 2>/dev/null
	@-rm -v *.html 2>/dev/null
	@-rm -v bibstring.txt 2>/dev/null

Some notes on the makefile. I execute bibtex ignoring errors (the dash before ‘bibtex’) because bibtex will exit with an error if it doesn’t find any citations, or if there is no bibliography. Each run of pdflatex is logged to a file named “document_pdf_<i>.log”, where “<i>” is the iteration number. The output of pdflatex and bibtex is suppressed by redirecting it to a logfile (I find the verbosity useless to have in the console).

The shell script in the PDF recipe iterates up to four times. On each pass it first greps the log of the most recent pdflatex run, looking for a line where LaTeX recommends that we “Rerun” it. If it finds such a line, it stores it in the shell variable STABELIZED; otherwise the variable gets the empty string. (Despite its name, the variable is non-empty exactly when the document has not yet stabilized.) Then we test whether the string is empty. If it is, we’re done, so we break out of the loop. If it’s not, we rerun pdflatex.
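
The same rerun check can be pulled out of the makefile into a small shell function, which makes it easier to try in isolation. This is only a sketch: the function names and the PDFLATEX override variable are made up for illustration, and it assumes that pdflatex writes its log to “<main>.log” as in the recipe above.

```shell
#!/bin/sh
# Sketch of the rerun loop from the PDF recipe as standalone functions.

# needs_rerun LOGFILE: succeeds if the log contains a "Rerun" suggestion.
needs_rerun() {
    grep -q "Rerun" "$1"
}

# build_until_stable MAIN [MAX]: run pdflatex once, then rerun it while the
# log still asks for a rerun, for at most MAX extra passes (default 4).
# PDFLATEX is a hypothetical override so the loop can be tried without TeX.
build_until_stable() {
    local main="$1" max="${2:-4}" pass=0
    "${PDFLATEX:-pdflatex}" "$main" > "${main}_0.log"
    while needs_rerun "${main}.log" && [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        "${PDFLATEX:-pdflatex}" "$main" > "${main}_${pass}.log"
    done
    echo "Finished after $pass extra passes"
}
```

Calling `build_until_stable document_pdf` from the document directory would then do roughly what the recipe does, minus the bibtex step.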

The shell script in the HTML recipe iterates over each of the (potentially multiple, potentially zero) bibliography files, processing each of them with latexml. For each one it appends the string “--bibliography=<filename>.xml” to the BIBSTRING shell variable. The last thing it does is echo the contents of that variable to the file “bibstring.txt”. This is necessary because make runs each recipe line in its own shell, so a shell variable set on one line is gone by the next; writing it to a file lets the subsequent commands find it.
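
As a rough illustration of the string that ends up in bibstring.txt, here is the option-building step on its own. The function name build_bib_args is invented for this sketch, and unlike the recipe it does not run latexml; it only prints the options that would be handed to latexmlpost.

```shell
#!/bin/sh
# Sketch: build the --bibliography options for latexmlpost.

# build_bib_args FILE.bib...: print one --bibliography=NAME.xml option per
# bibliography file, separated by spaces (prints an empty line for no files).
build_bib_args() {
    local args="" bibfile
    for bibfile in "$@"; do
        args="$args --bibliography=$(basename "$bibfile" .bib).xml"
    done
    # Drop the leading space left over from the first append.
    echo "${args# }"
}
```

With a single references.bib this prints “--bibliography=references.xml”, and with no arguments it prints an empty line, which matches the zero-bibliography case of the recipe.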


Personal Dynamic DNS in Ubuntu

I finally got around to purchasing a personal server and one of the first things I did was set up a private DNS server for cheshirekow.com. As it turns out, setting it up to be dynamic is quite easy. In this post I’ll go through the steps I took to get it up and running.

I won’t bother with all the fun stuff about how dynamic DNS works or how to properly configure everything; instead I’ll just post my configuration files for posterity.

More detailed information on configuring bind can be found in the Ubuntu Server Guide. A good article on nsupdate and dynamic updates to bind can be found on Jeff Garzik’s linux pages. I found the information I needed on NetworkManager hooks from sysadmin’s journey.

Why Dynamic DNS?

Mostly because I’m lazy. I have a work laptop, a personal desktop, a netbook, an android tablet, and an android phone. I’m constantly scp’ing files from one to another, and I really hate having to write out the ip address specifically all the time. Since I own the domain cheshirekow.com, I figured it would be really slick to be able to address all of my machines as subdomains. For instance, I could label them as “laptop.cheshirekow.com”, “desktop.cheshirekow.com”, “netbook.cheshirekow.com”, “tablet.cheshirekow.com”, and “phone.cheshirekow.com”. If these dns entries are automatically updated when each of these devices connects to a wifi access point using DHCP, then I can even get files from one machine to another without even being physically near them.

named.conf.local

Following the Ubuntu guide, I edited /etc/bind/named.conf.local to look like the following:

//
// Do any local configuration here
//
 
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";
 
zone "cheshirekow.com" {
	type master;
	file "/var/lib/bind/db.cheshirekow.com";
	allow-transfer { aaa.bbb.ccc.ddd; };
	allow-update { key "user.cheshirekow.com."; };
};

Note that the file is /var/lib/bind/db.cheshirekow.com, not /etc/bind/db.cheshirekow.com as a lot of tutorials will tell you. This is because Ubuntu prevents bind from writing to files in /etc/bind. You can either change the apparmor profile for bind or just do as I do and put the file where it’s supposed to go, in /var/lib/bind/ (there’s a note in the bind apparmor profile about this). Putting it in /etc/bind is fine if the DNS entries are all static, but if there are dynamic entries then bind will try to create a .jnl file in the same directory as the db.xxx file. Since bind can’t write to /etc/bind, we need to put the db file somewhere else.

Also, note that aaa.bbb.ccc.ddd is the ip address of my secondary name server for cheshirekow.com. I’m using afraid.org to host my secondary DNS.

The allow-update line allows the holder of the key named “user.cheshirekow.com.” (think of it as the user user@cheshirekow.com) to update the DNS entries (the dynamic part), as verified by that key (generating the key comes later). Note that I don’t use the literal “user”.

/var/lib/bind/db.cheshirekow.com

The next thing was to create the db.cheshirekow.com file, which looks like this:

$ORIGIN .
$TTL 604800	; 1 week
cheshirekow.com		IN SOA	ns1.cheshirekow.com. cheshirekow.gmail.com. (
				9          ; serial
				604800     ; refresh (1 week)
				86400      ; retry (1 day)
				2419200    ; expire (4 weeks)
				604800     ; minimum (1 week)
				)
			NS	ns1.cheshirekow.com.
			A	aaa.bbb.ccc.ddd
			AAAA	::1
$ORIGIN cheshirekow.com.
ns1			A	aaa.bbb.ccc.ddd
www			A	eee.fff.ggg.hhh

Note that aaa.bbb.ccc.ddd is the IP address of the name server itself and eee.fff.ggg.hhh is the IP address of my web server (where you are currently reading this). Also note that my email address is cheshirekow@gmail.com, but in this file the “@” is replaced by a dot and the name ends with a trailing dot, giving “cheshirekow.gmail.com.”.
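
The email rewriting rule for the SOA record is mechanical enough to script. Below is a small sketch; the function name email_to_rname is made up, and it assumes an address with exactly one “@”. It also escapes any dot in the part before the “@” with a backslash, which is what the zone file format expects so that the dot is not read as a label separator.

```shell
#!/bin/sh
# Sketch: convert an email address into the mailbox form used in an SOA record.

# email_to_rname ADDRESS: print user@host as "user.host." with dots in the
# local part escaped as "\.".
email_to_rname() {
    local local_part="${1%@*}" domain="${1#*@}"
    local_part=$(printf '%s' "$local_part" | sed 's/\./\\./g')
    printf '%s.%s.\n' "$local_part" "$domain"
}
```

For example, `email_to_rname cheshirekow@gmail.com` prints the form used in the zone file above.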

You can (should?) also set up reverse DNS entries for all these things, but I did not, as the server is actually sitting in a different physical domain. In other words, I don’t own a network of IP addresses, so there’s no reason to expect my server to be queried for reverse DNS lookups.

Create Keys

The next thing we need to do is set up a key that we can use to do dynamic updates. This can be done on a separate machine from the name server; it doesn’t matter.

user@ns1:~$ mkdir .bind
user@ns1:~$ cd .bind
user@ns1:~/.bind$ dnssec-keygen -a HMAC-MD5 -b 512 -n USER user.cheshirekow.com.

Note that “USER” is a literal string, not a placeholder for something that you create. Also note that “user.cheshirekow.com” is the name of this key, and corresponds to the email address “user@cheshirekow.com”.

This command creates two key files, a .key file and a .private file.

user@ns1:~/.bind$ ls -l
total 8
-rw------- 1 user user 127 2011-06-10 16:51 Kuser.cheshirekow.com.+157+56713.key
-rw------- 1 user user 229 2011-06-10 16:51 Kuser.cheshirekow.com.+157+56713.private

Install Keys

Now we create a file to store the key. I put it in /etc/bind/keys.local:

key "user.cheshirekow.com." {
	algorithm HMAC-MD5;
	secret "2345A/bkd7GDcu9orjzblkj2r37ajglk489DLHD/m987addzjDCadsh8 bbIUOY809glkashDEmPj5alIUoiEeA==";
};

Note that this is not a real key, but random gibberish I pounded out on the keyboard. In reality, this key is copied directly from Kuser.cheshirekow.com.+157+56713.key.
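
Copying the secret by hand is easy to get wrong, so here is a sketch of pulling it out of the .key file with awk. It rests on an assumption about the file layout, namely that the file holds a single line of the form “name IN KEY flags protocol algorithm secret”, with the secret possibly split into several space-separated pieces; check your own .key file before relying on it. The function name key_secret is invented for this example.

```shell
#!/bin/sh
# Sketch: extract the base64 secret from a dnssec-keygen .key file.
# Assumed layout: name IN KEY flags protocol algorithm secret [more secret...]

# key_secret KEYFILE: print fields 7 and onward joined without spaces.
key_secret() {
    awk '{ for (i = 7; i <= NF; i++) printf "%s", $i; print "" }' "$1"
}
```

The printed string is what would go between the quotes on the secret line of keys.local.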

I then added an include for this file to /etc/bind/named.conf so that it looks like this:

// This is the primary configuration file for the BIND DNS server named.
//
// Please read /usr/share/doc/bind9/README.Debian.gz for information on the 
// structure of BIND configuration files in Debian, *BEFORE* you customize 
// this configuration file.
//
// If you are just adding zones, please do that in /etc/bind/named.conf.local
 
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
include "/etc/bind/keys.local";

Restart bind

That’s it for the bind setup, so restart bind:

user@ns1:~$ sudo /etc/init.d/bind9 restart

Client Update Script

I then created the following update script in /etc/NetworkManager/dispatcher.d/99updatedns. This script is called as a hook from NetworkManager every time an interface goes up or down. It receives two parameters. The first is the name of the interface (e.g. eth0 or wlan0) and the second is the status (e.g. up or down).

#!/bin/bash
 
INTERFACE=$1
STATUS=$2
DIRECTORY="/home/user/Codes/shell/dyndns"
 
if [ "$STATUS" = "up" ]; then
    # pull the IPv4 address out of the "inet addr:a.b.c.d" line of ifconfig
    IPADDRESS=`ifconfig "$INTERFACE" | grep inet | grep -v inet6 | cut -d ":" -f 2 | cut -d " " -f 1`
    # fill the address into a copy of the nsupdate command template
    cp "$DIRECTORY/nsupdate_src.txt" /tmp/nsupdate.txt
    sed -i "s/IPADDRESS/$IPADDRESS/" /tmp/nsupdate.txt
    nsupdate -k /home/user/.bind/Kuser.cheshirekow.com.+157+56713.private -v /tmp/nsupdate.txt
fi

Note that this script requires the nsupdate_src.txt which is here:

server ns1.cheshirekow.com
zone cheshirekow.com
update delete netbook.cheshirekow.com. A
update add netbook.cheshirekow.com. 86400 A IPADDRESS
show
send

The script extracts the IP address from the output of ifconfig for the given interface, copies the template file to /tmp/, replaces IPADDRESS with the actual address of the machine, and then calls nsupdate using the private key and the filled-in file. The script is saved as /etc/NetworkManager/dispatcher.d/99updatedns, owned by root and flagged executable. Note that it accesses the key belonging to my specific user, which is fine in my case because my netbook is a single-user machine. If the machine has multiple users, you may want to store the key and text file somewhere only root can read, such as /root.
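
To see what the grep and cut pipeline actually does, it helps to run it on canned ifconfig output rather than a live interface. The sketch below wraps the two text-processing steps in functions (the names extract_ipv4 and fill_template are made up for illustration). It assumes the older ifconfig layout with an “inet addr:a.b.c.d” line; newer versions that print “inet a.b.c.d” without the colon will not work with this pipeline.

```shell
#!/bin/sh
# Sketch: the two text-processing steps of the update script, as filters.

# extract_ipv4: read ifconfig-style output on stdin, print the IPv4 address.
# Assumes a line of the form "inet addr:192.168.1.5  Bcast:...".
extract_ipv4() {
    grep inet | grep -v inet6 | cut -d ":" -f 2 | cut -d " " -f 1
}

# fill_template ADDRESS: replace the IPADDRESS placeholder in stdin.
fill_template() {
    sed "s/IPADDRESS/$1/"
}
```

For example, `ifconfig wlan0 | extract_ipv4` prints the address, and `fill_template "$ADDR" < nsupdate_src.txt` prints the command file that nsupdate would receive.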

Result

The result of this process is that netbook.cheshirekow.com always points to the ip address of my netbook, given that it is connected to a wifi access point. Whenever the netbook (re)connects to an access point, the network manager calls the script, and the dns entry on ns1.cheshirekow.com is updated.

(Update) Better Script

I changed the update script a little bit. Since I use a wired connection on my laptop most of the time, I don’t want the IP address of the wireless connection to supersede that of the wired connection when the wired connection is active.

#!/bin/bash
 
INTERFACE=$1
STATUS=$2
DIRECTORY="/home/user/Codes/shell/dyndns"
 
echo "network interface change hook:"
echo "----------------------------";
 
#first, check to see if eth0 is up and running
ETH0STR=`ifconfig eth0 | grep inet | grep -v inet6`
if [ -z "$ETH0STR" ]
then
    echo "eth0 has no address (probably is down or disconnected)"
    echo "checking interface $INTERFACE whose change launched this script"
    if [ "$STATUS" = "up" ]
    then
        IPADDRESS=`ifconfig $INTERFACE | grep inet | grep -v inet6 | cut -d ":" -f 2 | cut -d " " -f 1`
        if [ -z "$IPADDRESS" ]
        then
            echo "$INTERFACE has no address, aborting (str = $IPADDRESS)"
        else
            echo "$INTERFACE has address $IPADDRESS"
            cp $DIRECTORY/nsupdate_src.txt /tmp/nsupdate.txt
            sed -i "s/IPADDRESS/$IPADDRESS/" /tmp/nsupdate.txt 
            nsupdate -k /home/user/.bind/Kuser.cheshirekow.com.+157+56713.private -v /tmp/nsupdate.txt            
        fi
    else
        echo "Status is not 'up', aborting"
    fi
else
    IPADDRESS=`echo $ETH0STR | cut -d ":" -f 2 | cut -d " " -f 1`
    echo "eth0 has address $IPADDRESS, ignoring changed interface $INTERFACE"
    cp $DIRECTORY/nsupdate_src.txt /tmp/nsupdate.txt
    sed -i "s/IPADDRESS/$IPADDRESS/" /tmp/nsupdate.txt 
    nsupdate -k /home/user/.bind/Kuser.cheshirekow.com.+157+56713.private -v /tmp/nsupdate.txt
fi
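
The decision the script makes boils down to three cases, which can be written as one small function. This is a sketch of the same logic, not a replacement for the script: the name choose_address is made up, and the addresses are passed in as arguments instead of being read from ifconfig.

```shell
#!/bin/sh
# Sketch: which address should be published, given the state of eth0 and
# of the interface whose change triggered the hook.

# choose_address ETH0_ADDR IFACE_ADDR STATUS: print the address to publish,
# or fail (non-zero exit) if there is nothing to publish.
choose_address() {
    local eth0_addr="$1" iface_addr="$2" status="$3"
    if [ -n "$eth0_addr" ]; then
        echo "$eth0_addr"      # the wired address always wins
    elif [ "$status" = "up" ] && [ -n "$iface_addr" ]; then
        echo "$iface_addr"     # fall back to the interface that changed
    else
        return 1               # interface went down or has no address
    fi
}
```

Written this way, the nsupdate call would only need to appear once, after the address has been chosen.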

Edit:

For some reason whenever I update db.cheshirekow.com bind refuses to restart correctly. When I do this update, I have to delete the file /var/lib/bind/db.cheshirekow.com.jnl and restart.
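
A likely explanation (an assumption, not something verified on this setup) is that a hand-edited zone file no longer matches the .jnl journal that bind keeps for dynamic updates. bind’s rndc tool has freeze and thaw commands for this situation: freeze suspends dynamic updates and writes the journal into the zone file, and thaw reloads the zone and re-enables updates. A minimal sketch follows; the function name and the RNDC and ZONE_EDITOR override variables are made up so that it can be tried without touching a real server.

```shell
#!/bin/sh
# Sketch: edit a dynamic zone file without deleting the journal by hand.

# edit_dynamic_zone ZONE FILE: freeze the zone, open its file in an editor,
# then thaw the zone again. RNDC and ZONE_EDITOR are hypothetical overrides.
edit_dynamic_zone() {
    local zone="$1" file="$2"
    "${RNDC:-rndc}" freeze "$zone" || return 1
    "${ZONE_EDITOR:-${EDITOR:-vi}}" "$file"
    "${RNDC:-rndc}" thaw "$zone"
}
```

On the name server this would be run as root, for example `edit_dynamic_zone cheshirekow.com /var/lib/bind/db.cheshirekow.com`, remembering to bump the serial number while editing.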
