My name is Nick Kemp. I am a computer programming student at Seneca College in Toronto. My main interests include computers, music, art and video games. I hope to get a job as a Programmer so I can travel the world. The main purpose of this blog is to post programs that I have written for class and computer science related stuff. I hope you enjoy it. Have fun :)

30th May 2014

Question

Anonymous asked: Hello, I knew about "Groonga testing phase" entry April 8, so are there any progress since then?

Sorry, I didn’t see this message. I couldn’t get it to build in an arm64 environment. I thought I had a clean build but maybe it was a false positive or something. I also had to transfer my build from my home machine to a server because my machine wasn’t powerful enough to run a better, more accurate arm64 environment. I’m still working at it but I’ve been very busy. Can I get a list of all the dependencies that I need? I think I’m just missing some dependencies on my new environment and it’s causing the build to fail. I’ll post this message on the mailing list too.

8th April 2014

Post

Groonga Testing Phase

Introduction

Last week I blogged about the issues that I was having while trying to run Groonga’s test script. My issues pretty much boiled down to getting the right version of Ruby installed and not having Cutter installed.

Updated Script

I also posted a script that would install everything for me because I had to do it so many times. I have since updated the script to log the progress of the script and to test if commands ran successfully. I also added some dependencies that I would need to the script that I didn’t add before. Here’s the new version:

#!/bin/bash
top_dir=$(pwd)
logfile=$top_dir/logfile.txt

#testing if the logfile exists
if [ -e $logfile ]; then
  echo Deleting old log file
  rm -f $logfile
fi

touch $logfile

echo Installing dependencies for ruby
yum groupinstall "Development Tools" -y
yum install autoconf gdbm-devel ncurses-devel libdb-devel libffi-devel openssl-devel libyaml-devel readline-devel tk-devel procps dtrace git cmake 'cutter*' -y
echo Changing directory to top directory
cd $top_dir

#testing for older version of the tarball
if [ -e ruby-1.9.3-p545.tar.gz ]; then
  echo deleting any old instances of ruby tarball
  rm -f ruby-1.9.3-p545.tar.gz
fi

#testing for older version of the ruby
if [ -e ruby-1.9.3-p545 ]; then
  cd ruby-1.9.3-p545
  echo Removing older version of ruby
  make clean
  echo deleting old repo
  cd $top_dir
  rm -rf ruby-1.9.3-p545
fi

#removing binaries left over from an older ruby install
for tool in ruby gem erb rake rdoc testrb ri irb; do
  if [ -e "/usr/local/bin/$tool" ]; then
    echo "Deleting /usr/local/bin/$tool"
    rm -f "/usr/local/bin/$tool"
  fi
done

#getting ruby's tarball
echo "Getting ruby's tarball"
wget http://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.9.3-p545.tar.gz

#extracting ruby tarball
echo "Extracting ruby's tarball"
tar xvzf ruby-1.9.3-p545.tar.gz

echo Entering to source directory
cd ruby-1.9.3-p545

echo Configuring ruby
if ./configure --build=aarch64-unknown-linux; then
  echo successfully configured ruby >> $logfile
else
  echo "Couldn't configure ruby. Exiting…" >> $logfile
  exit 1
fi

echo Building ruby
if make; then
  echo Successfully built ruby >> $logfile
else
  echo "Couldn't build ruby. Exiting…" >> $logfile
  exit 1
fi

echo installing ruby
if make install; then
  echo Successfully installed ruby >> $logfile
else
  echo "Couldn't install ruby. Exiting…" >> $logfile
  exit 1
fi

if [ -e /usr/local/bin/ruby ]; then
  echo "***Successful install of ruby***"
else
  echo "***Ruby not installed. Exiting***"
  exit 1
fi

echo "Installing required gems"
/usr/local/bin/gem install yajl-ruby msgpack test-unit test-unit-rr test-unit-notify

echo "Installing groonga's dependencies"
yum install mecab-devel zlib-devel lzo-devel msgpack-devel zeromq-devel libevent-devel python2-devel php-devel libedit-devel pcre-devel systemd -y

echo changing directories
cd $top_dir

if [ -e groonga ]; then
  echo deleting any old instances of groonga
  cd groonga
  make clean
  cd $top_dir
  rm -rf groonga
fi

#removing binaries left over from an older groonga install
for tool in groonga groonga-benchmark groonga-suggest-create-dataset groonga-suggest-httpd groonga-suggest-learner; do
  if [ -e "/usr/local/bin/$tool" ]; then
    echo "Deleting /usr/local/bin/$tool"
    rm -f "/usr/local/bin/$tool"
  fi
done

#cloning groonga
echo cloning groonga repository
if git clone https://github.com/nrkemper/groonga; then
  echo Successfully cloned groonga >> $logfile
else
  echo "Couldn't clone groonga. Exiting…" >> $logfile
  exit 1
fi

echo "changing directories into groonga's top directory"
cd groonga

#running groonga/autogen.sh
echo running autogen.sh
if ./autogen.sh; then
  echo autogen.sh ran successfully >> $logfile
else
  echo autogen.sh failed. Exiting… >> $logfile
  exit 1
fi

#configuring groonga
echo configuring groonga
if ./configure --with-ruby19=/usr/local/bin/ruby --build=aarch64-unknown-linux; then
  echo configured groonga successfully >> $logfile
else
  echo Failed to configure groonga. Exiting… >> $logfile
  exit 1
fi
 
#building groonga
echo building groonga
if make; then
  echo Built groonga successfully >> $logfile
else
  echo Failed to build groonga. Exiting… >> $logfile
  exit 1
fi

#installing groonga
echo installing groonga
if make install; then
  echo Successfully installed groonga >> $logfile
else
  echo Failed to install groonga. Exiting… >> $logfile
  exit 1
fi

if [ -e /usr/local/bin/groonga ]; then
  echo "***Successfully installed groonga***"
else
  echo "***Groonga did not install successfully***"
fi

Progress

I have made a lot of progress on testing. I FINALLY got Groonga correctly configured with Cutter and Ruby and I was able to run the test suite. I figured it would be better to run the test suite on x86 first to see if my code broke Groonga in any way. Somehow the code failed every test. That didn’t seem possible, so I cloned their repository again, but this time I didn’t add my code. I rebuilt Groonga and ran the test suite again and it STILL failed every test. I figured that I was just using an unstable release, so I used fedpkg to get the version in the Fedora repo, the one that they release for production. I built it and it STILL failed ALL the tests. Either I did something wrong when building or they released code that didn’t pass their own test suite. I feel as though both are a possibility. Either way, I emailed the community to see what they think of my issue. Hopefully they have a solution or a direction that I should be going in, because I am at a loss. If it doesn’t pass the tests on x86, how can I be sure that it passes on ARM?

Conclusion

Things are going well but I am at the end of my rope lol. I have had my code written for a month or so now. My only issue has been this test suite. Once I get my code to pass the tests I can submit a patch file to the community and be done with this. In the meantime I think I’m going to mess around with gprof to see how much my code is actually being used.

Tagged: spo600

6th April 2014

Post

Groonga Progress So Far

It has been a nightmare trying to get this tester running for Groonga. I have built and rebuilt Groonga several times. The issue was that it was not configuring correctly with Ruby. It turns out that I needed to install Ruby 1.9.3 and not the one in the Fedora package. This means building from source…on qemu…:(. I already built Ruby and Groonga from source on my local machine, but when I went to run the tester it didn’t work again because I forgot to add the --with-ruby19=PATH_TO_RUBY option when configuring Groonga, so Groonga couldn’t find Ruby for the test. When I settled that issue, I found I had forgotten to install a couple of dependencies for Ruby and got some errors when I tried to run it. After that, my qemu stopped working on my local machine and I couldn’t install the dependencies I needed for Ruby. So now I finally got smart and wrote a script that can handle all these tasks for me, and I can just calmly walk away and go work on other stuff while it runs on Ireland. It’s nothing too fancy; I just wanted to learn how to write a Bash script. Here’s the script:

#!/bin/bash
echo Installing dependencies for ruby
yum groupinstall "Development Tools" -y
yum install autoconf gdbm-devel ncurses-devel libdb-devel libffi-devel openssl-devel libyaml-devel readline-devel tk-devel procps -y

echo Changing directory to root
cd /root/

#testing for older version of the tarball
if [ -e ruby-1.9.3-p545.tar.gz ]; then
  echo deleting any old instances of ruby tarball
  rm -f ruby-1.9.3-p545.tar.gz
fi

#testing for older version of the ruby
if [ -e ruby-1.9.3-p545 ]; then
  cd ruby-1.9.3-p545
  echo Removing older version of ruby
  make clean
  echo deleting old repo
  cd /root/
  rm -rf ruby-1.9.3-p545
fi

echo "Getting ruby's tarball"
wget http://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.9.3-p545.tar.gz

echo "Extracting ruby's tarball"
tar xvzf ruby-1.9.3-p545.tar.gz

echo Entering to source directory
cd ruby-1.9.3-p545

echo Configuring ruby
./configure --build=aarch64-unknown-linux

echo Building ruby
make

echo installing ruby
make install

echo "Installing required gems"
/usr/local/bin/gem install yajl-ruby msgpack test-unit test-unit-rr test-unit-notify

echo "Installing groonga's dependencies"
yum install mecab-devel zlib-devel lzo-devel msgpack-devel zeromq-devel libevent-devel python2-devel php-devel libedit-devel pcre-devel systemd -y

echo changing directories
cd /root/

if [ -e groonga ]; then
  echo deleting any old instances of groonga
  cd groonga
  make clean
  cd /root/
  rm -rf groonga
fi

echo cloning groonga repository
git clone https://github.com/nrkemper/groonga
echo "changing directories into groonga's root"
cd groonga
echo running autogen.sh
./autogen.sh
echo configuring groonga
./configure --with-ruby19=/usr/local/bin/ruby --build=aarch64-unknown-linux
echo building groonga
make
echo installing groonga
make install

Tagged: spo600

7th March 2014

Post

Groonga(cont.)

Introduction

Last week I blogged about my progress so far with Groonga. I have made a bit of progress since then that I think is important to say.

First of all I reworked the code that I was working on to truly make the operations atomic. The new atomics are as follows:

Atomic Add

__asm__ __volatile__("  dmb sy;          "
                     "  ldr %0, [%1];    "
                     "  dmb ld;          "
                     "  add x0, %0, %2;  "
                     "  str x0, [%1];    "
                     "  dsb st;          "
                     : "=&r"(r)
                     : "r"(p), "r"(i)
                     : "x0", "memory");

Reverse Bit Scan

__asm__ __volatile__ ("clz %0, %1;"
                      "mov x9, #63;"
                      "sub %0, x9, %0;"
                      "dsb sy;"
                      : "=r"(r)
                      : "r"(v)
                      : "x9")

Reverse Bit Scan Including 0

__asm__ __volatile__ ("clz %0, %1;"
                      "mov x9, #63;"
                      "sub %0, x9, %0;"
                      "mov x9, #0;"
                      "cmp %0, x9;"
                      "csel %0, %0, x9, ge;"
                      "dsb sy;"
                      : "=r"(r)
                      : "r"(v)
                      : "x9", "cc")

Atomic Store

__asm__ __volatile__("    str %1, [%0];   "
                     "    dsb st;         "
                     :
                     : "r"(p), "r"(v)
                     : "memory")

Differences

The only difference between the code above and the code that I wrote before is that the new code sets up memory barriers around the loads and stores and finishes with a synchronization barrier. The memory barrier (dmb) ensures that the memory accesses before it are observed before any memory access after it, and the synchronization barrier (dsb) ensures that no later instruction executes until the memory accesses before it have completed. The intent is for these barriers to make the whole group of instructions behave atomically, unlike before.
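For comparison, here is a minimal sketch of an add and a store written with GCC's `__atomic` builtins instead of hand-written barriers. This is not the code from the Groonga patch. The names `atomic_add_sketch` and `atomic_store_sketch` are made up for illustration, and the sketch assumes a GCC or Clang toolchain, where the compiler chooses the instructions and barriers for the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomically adds i to *p and returns the value
   that *p held before the addition (fetch-and-add). */
static inline uint64_t atomic_add_sketch(uint64_t *p, uint64_t i)
{
    assert(p != 0);
    return __atomic_fetch_add(p, i, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: atomically stores v into *p with sequentially
   consistent ordering. */
static inline void atomic_store_sketch(uint64_t *p, uint64_t v)
{
    assert(p != 0);
    __atomic_store_n(p, v, __ATOMIC_SEQ_CST);
}
```

Because the read, the add and the write happen as one indivisible operation, no other thread can slip in between the load and the store.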

Building

I have successfully built Groonga with my new code after a lot of struggle because the build time was over 2 hours and my connection with Ireland kept timing out. However, Chris led me to a tool called Screen that lets you create multiple screens in a Linux terminal. The great thing is that if your connection dies while running Screen, you can reattach your screen and go back to right where you left off. That was perfect for me and let me build Groonga successfully.

Testing

After I successfully built Groonga I contacted the Groonga mailing list to find out the best way to test my new code. I wasn’t too sure how to go about testing but apparently they have a script that can test it for me. I would never have thought of creating a script for other developers to use to test my program. It seems simple, but I thought testing would have just been left up to the individual developer. It makes perfect sense though now that I think about it.

So that is where I stand now. I have installed Cutter, the software that is needed to run the tester, and tried to run the script, but I get an error saying “Cutter command not found”. The developer who told me about the testing script warned me that something like this could happen because Cutter might not have been ported to ARM64 yet. I have emailed him about the error and am awaiting his response to see the best way to proceed. If worse comes to worst I might just have to test it manually, which could take a while depending on the amount of testing that they want done.

Conclusion

All in all, things are going very well. I have written the first version of my code and I am about to test it with their tester. If I can get it working it should be smooth sailing from there. If all goes well and the code that I created does what it is supposed to do, I should have this done in a couple of days and should be ready to start the second package that I picked up in place of CSound. The package is called Ugene and is used in processing bioinformatics (how cool is that?).

Tagged: spo600

3rd March 2014

Post

Setting Up Qemu On Home Machine

I asked Chris how to set up Qemu on our home machines. He told me and asked me to edit the wiki and add the instructions on how to do it and blog the link to everybody. The directions are found here -> http://zenit.senecac.on.ca/wiki/index.php/SPO600_aarch64_QEMU_on_Ireland#Setting_Up_Qemu_On_Home_Machine at the bottom of the page.

Tagged: spo600

26th February 2014

Post

Groonga Coding

Introduction

I have been working on some code for the atomics that Groonga requires. My previous blog post about the status of where I am at can be found here. Groonga requires atomics for a “Fetch and Add”, “Bit Scan” and “Reverse Bit Scan”. An explanation of each comes below along with my attempt at writing these instructions atomically.

Fetch And Add

Fetch and add is the atomic equivalent of r=*p; *p+=i. This was my initial attempt at performing a “fetch add” in GCC inline assembly:

  __asm__ __volatile__("ldxr %0, [%1];"
                       "add %0, %0, %2;"
                       "stxr w0, %0, [%1]"
                       : "=&r"(r)
                       : "r"(p), "r"(i)
                       : "w0", "memory");

This code works perfectly fine, but I am not sure that it works for the reasons I think it does. At best it works under the conditions it is currently running in, and it could fail if more than one process is running. The reason I think this is that ARM has a feature called “exclusive access”. That’s what the ldxr (Load Exclusive Register) and stxr (Store Exclusive Register) instructions are using. From what I understand, Exclusive Access is an area of memory that a particular thread has Exclusive Access to, meaning no other threads can touch it. So why would I use Exclusive Access instructions? Because I initially thought that Exclusive Access meant that no other instructions would be performed while the data being loaded was retrieved from memory. I need to do more research and find an actually atomic method of doing a “Fetch Add”, but this code could work for now as a template for what I want to accomplish. I wouldn’t put this code in a patch because it could cause the program to crash.
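One detail worth noting: stxr writes a status value into its first register operand (0 if the store succeeded, non-zero if exclusive access was lost), and the snippet above never checks w0. The usual pattern is to retry until the store succeeds. Here is a minimal sketch of that retry idea in C, using GCC's compare-and-swap builtin in place of ldxr/stxr. The name `fetch_add_retry` is made up for illustration and this is not the code that went into Groonga.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fetch-and-add built from a compare-and-swap retry loop.
   Returns the value *p held before the addition. */
static inline uint32_t fetch_add_retry(uint32_t *p, uint32_t i)
{
    assert(p != 0);
    uint32_t old = __atomic_load_n(p, __ATOMIC_RELAXED);
    /* On failure the builtin refreshes `old` with the current value of *p,
       so each retry works with up-to-date data. */
    while (!__atomic_compare_exchange_n(p, &old, old + i, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED)) {
        /* another thread changed *p between the load and the store; retry */
    }
    return old;
}
```

The loop only exits once the new value was stored on top of exactly the value that was read, which is the property the single ldxr/stxr pass above cannot guarantee on its own.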

Reverse Bit Scan

Reverse Bit Scans are the opposite of what you think they should do. You think they should look for the first instance of a 1 or 0 bit starting from the least significant bit, but in fact they start from the most significant bit (exactly what you probably would think a regular Bit Scan would do). My version of a Reverse Bit Scan written in GCC inline assembly is:

  __asm volatile("clz %0, %1;"
                 : "=r"(result)
                 : "r"(num));

Pretty simple. The clz instruction counts the leading zeros, therefore this bit scan looks for the first instance of a 1 bit starting at the most significant bit.
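Note that clz returns the number of leading zeros, so the index of the highest set bit in a 64-bit value is 63 minus that count. A minimal sketch of the same idea with a GCC builtin instead of inline assembly follows. The name `reverse_bit_scan` is made up for illustration, and the input must be non-zero because `__builtin_clzll(0)` is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index (0..63) of the highest set bit of v.
   v must be non-zero because __builtin_clzll(0) is undefined. */
static inline int reverse_bit_scan(uint64_t v)
{
    assert(v != 0);
    return 63 - __builtin_clzll(v);
}
```

For example, `reverse_bit_scan(0x80)` has 56 leading zeros in 64 bits, so it returns 7.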

Bit Scan

Bit Scans are the reverse of a Reverse Bit Scan. Or the reversal’s, reverse of the reverse of a Reverse Bit Scan if you will. In layman’s terms, they look for the first instance of a 1 bit or 0 bit (depending on the Bit Scan) beginning from the least significant bit. This is my version of a Bit Scan written in GCC inline assembly:

  __asm volatile("rbit %0, %1;"   /* reverse into the output register; num is input-only */
                 "clz %0, %0;"
                 : "=r"(result)
                 : "r"(num));

The exact same as the code for a Reverse Bit Scan but the rbit instruction reverses the bits in the register before the clz instruction is applied. There may be an instruction that can do this in one step and if I find it I will replace it, but for now this is good enough.
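To check the reverse-then-count idea without ARM hardware, here is a minimal portable sketch. `reverse_bits64` is a made-up stand-in for the rbit instruction, `bit_scan` is a hypothetical name, and the sketch assumes GCC or Clang for the `__builtin_clzll` builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the rbit instruction: reverses all 64 bits. */
static inline uint64_t reverse_bits64(uint64_t v)
{
    uint64_t r = 0;
    for (int k = 0; k < 64; k++) {
        r = (r << 1) | (v & 1);
        v >>= 1;
    }
    return r;
}

/* Index of the lowest set bit, computed the rbit + clz way.
   v must be non-zero because __builtin_clzll(0) is undefined. */
static inline int bit_scan(uint64_t v)
{
    assert(v != 0);
    return __builtin_clzll(reverse_bits64(v));
}
```

GCC also provides `__builtin_ctzll`, which counts trailing zeros directly and leaves the choice of instructions to the compiler; for non-zero inputs it should agree with `bit_scan` above.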

Tagged: spo600

25th February 2014

Post

SPO Package Progress

Introduction

For the last couple of days I have been working on a couple of open source Linux packages and trying to port them over to the ARM64 architecture as part of my SPO600 course. My last blog entry can be found here. The two packages that I have been working on are CSound and Groonga.

CSound

I consider my attempt at porting CSound a bunch of mistakes and a huge pile of embarrassment. Kind of like a drunken night in college… hardy har har. I began working on CSound by diving straight into the code. I saw the assembly and tried making a straight copy of it for the ARM processor, but when I finally contacted the community they said that the code was only in there because the Microsoft Visual C++ compiler was “notoriously slow”. I should have taken this to mean there was nothing left to do, because there were C work-arounds for the assembly that was in there, but no…. I persisted. I went on a mission to determine if the assembly in there was still necessary because I felt that I had invested so much of my time that I needed to contribute something. Eventually I gave up, removed my name from the mailing list and put this in a deep place in memory never to be thought of again (this blog post aside). Mistakes aside, I learned a lot about the community and got more of an idea of how I needed to present myself and what information I needed to collect from the community before I could jump into the code. So in my eyes there is nothing left to contribute to CSound, and from the looks of it CSound will compile on ARM as is, but I still need to test it. Oddly enough, I built CSound with my ARM64 code in it before I actually built it without it. So in short, I just need to build it and see if it will successfully build as is; otherwise I need to go and talk to the community again. It doesn’t look like I need to though. But because this was an easier project I am going to pick up another package to replace it.

Groonga

Because CSound went so badly I learned a lot, and now I am applying what I learned to my approach to Groonga. First off, I contacted the community BEFORE I started coding. They welcomed me with open arms and realized that there is probably a porting issue there. After I made contact with the community I started looking for the assembly. Someone in the community gave me a build log of somebody else’s build of Groonga in the ARM environment, and that led me to where most of the assembly errors would have come from. Groonga also uses nginx, which is an HTTP web server. In the nginx code there is some assembly for powerpc, X86_64, AMD etc. for atomic operations such as fetch and add and compare and set, but there is no version for the ARM processor. I felt that this could be a portability issue so I contacted the creator of nginx and asked him if he had any plans to port it over to ARM. I am still awaiting his reply. But I also JUST realized that there is a community for nginx, so I will send them an email too to see if anybody can work on it for me. I find that getting people to do the work for me is so much more fun than painstakingly reading over 150 pages of ARM64 instructions to see if there is an instruction that suits my need for an atomic op.

As for the code that I need to work on, I plan on doing a straight ARM64 port in the beginning as a simple, temporary fix and then working with the community to see if we can include compiler intrinsics down the line so that Groonga doesn’t need to be updated every time there is a new processor.

Conclusion

I need to find a new package to replace CSound, but before that I need to see if CSound will compile on ARM (I suspect it will). I also need to code the atomic ops for ARM64 in Groonga and I need to see if nginx is going to make a port available for ARM64, but I think they have C work-arounds so that shouldn’t be too much of an issue. All in all, things are going ok.

 

Tagged: spo600

24th February 2014

Post

SPO Csound Package Progress Part 2

Introduction

Yesterday I posted a blog entry about the progress that I have made on both of my open source packages for SPO600. Since then I have made significant advances in getting at least one of them done.


Csound

I have been in contact with the developers of Csound and have managed to continuously annoy them with great success. After some emails back and forth I found out that my initial assumption was wrong: the assembly in their package wasn’t there for rounding purposes but because Microsoft’s Visual C++ compiler was “notoriously slow” according to the guy on the forum. He suggested that we just take the assembly out. I quickly took the assembly out and posted the modified version (it was only one file) to the forum because I couldn’t find any documentation about how their development process works. They responded by saying that it was code for csound 5 when they are currently developing csound 6. I knew that it was for csound 5 but I just wanted to get approval of what I was doing. That being said, I edited the csound 6 version of the file and posted it to the mailing list, only to find out that I had pulled the wrong version from git and that I need to be on the developers’ branch. Then they told me the best solution was just to keep the code in there because they need to look more into whether MSVC has improved its casting capabilities. I would stick with this and pester them more but it is such a low priority that I feel like I am impeding them from doing their work. I might actually just do the research myself and see if Microsoft’s Visual C++ compiler has improved at all and then talk to them again. All in all: my head hurts.


Groonga

As for the Groonga community, I have made contact with them and they welcomed me with open arms. When I told them what I wanted to do they began to run builds to see if it would in fact port over to the ARM64 CPU. They gave me build logs and other things to help me. But that’s about as far as I got with them. There is a lot of work that needs to be done and I’m probably going to try and get people from the community to help me if possible. I’m thinking about using gcc atomic intrinsics to do most of the stuff that they are trying to do, because why keep rewriting assembly when a new processor comes out?

Tagged: spo600

23rd February 2014

Post

SPO CSound Package Progress

Introduction

In my SPO600 class we are working with a company called Linaro who is porting open source packages to the ARM architecture. We were to choose two packages from a list of 60 or so packages that was weeded down from an initial list of 1400 or so. All of these packages contained some assembly that wouldn’t be portable to the ARM architecture. My two packages were CSound and Groonga. CSound is “a unit generator-based, user-programmable computer music system.”(http://www.csounds.com/manual/html/Introduction.html). Groonga is “an open-source fulltext search engine and column store. It lets you write high-performance applications that requires fulltext search”(groonga.org).

Progress So Far

As of right now I have only been working on my CSound package. I started out by diving right into the code. I looked for the assembly in the code and tried to figure out what it was doing. There were only three instances of assembly in the code so it made it really easy. The first instance was getting the clock ticks from the CPU’s Time Stamp Counter register. The other two instances were involved with converting floats and doubles to integers. The code that they wrote was only for Microsoft’s compiler so I took the inline assembly that they wrote and transferred it to GCC’s inline assembly syntax. Then I ran a bunch of tests to see what the difference was between the assembly version of converting floats and doubles to integers and straight casting. I found that when using assembly the precision that you get when rounding is better. Initially I thought that it was going to be faster, so I timed how long it took to convert 1,000,000,000 or so doubles and floats to integers using both casting and assembly. I didn’t find much of a difference.
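To illustrate the kind of difference I mean, here is a minimal sketch comparing a plain cast with a round-to-nearest conversion. It assumes the assembly version rounds to the nearest integer while a C cast simply drops the fraction. The function names are made up for illustration, and this is not CSound’s code.

```c
#include <assert.h>

/* Plain cast: drops the fractional part (rounds toward zero). */
static inline long convert_by_cast(double x)
{
    return (long)x;
}

/* Hypothetical round-to-nearest conversion, halves away from zero.
   Only meaningful when the result fits comfortably in a long. */
static inline long convert_by_rounding(double x)
{
    assert(x > -1e18 && x < 1e18);
    return (long)(x >= 0.0 ? x + 0.5 : x - 0.5);
}
```

With an input of 2.7 the cast yields 2 while the rounding version yields 3, which is the sort of precision gap the tests were measuring.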

After I figured out what it was doing I looked at the ARM processor instruction manual to see if there was an equivalent for ARM that I could use. I found out there was and wrote the code for it. I then tested it to see if it worked in the ARM environment on Ireland and it worked.

I then went about incorporating my new code into their build of CSound and tried building it on X86_64 and ARM64 and both of the builds went perfectly well. On ARM64 it took FOREVER because of how slow it is and all the dependencies that I needed to install but eventually I got a clean build.

As for the CSound community, I made the mistake of working on the code before asking them about it. I just emailed the community and got an angryish response when I told them who I was and what I wanted to do. I felt I was polite and respectful but maybe I was wrong. I am currently in the process of trying to establish a relationship with them, but I do have a patch for them that is ready to go and has been successfully built on ARM and x86_64.

Hopefully the Groonga community is more welcoming

Tagged: spo600

28th January 2014

Post

Lab 3 - Simple Loop With Assembly

Introduction

  In our lab class this week we finally began coding in assembly. Part 1 was to write a simple loop in x86_64 and aarch64 assembly that would print “Loop: X” where X was the iteration that the loop was in. Part 2 was to extend that program to write 30 iterations of “Loop: XX” where XX would be the iteration of the loop. Part 3 was to further extend part 2 to remove the leading zero of the number being printed. The code for these programs can be found at the following links:

Part 1:

   x86_64

   arm64

Part 2:

   x86_64

   arm64

Part 3:

  x86_64

  arm64
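For readers who don’t want to dig through the assembly, here is a rough C sketch of what Part 3 computes for each iteration: split the counter into a tens digit and a ones digit, skip the tens digit when it is zero, and append a newline. The function name `format_loop_line` and the buffer layout are made up for illustration; the assembly versions differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes "Loop: N\n" for 0 <= n <= 99 into out
   (at least 10 bytes) without a leading zero, and returns the length. */
static inline size_t format_loop_line(int n, char *out)
{
    static const char prefix[] = "Loop: ";
    size_t len = 0;
    assert(n >= 0 && n <= 99 && out != 0);
    for (size_t k = 0; prefix[k] != '\0'; k++)
        out[len++] = prefix[k];
    int tens = n / 10;   /* quotient, like the result of a divide */
    int ones = n % 10;   /* remainder */
    if (tens != 0)       /* Part 3: suppress the leading zero */
        out[len++] = (char)('0' + tens);
    out[len++] = (char)('0' + ones);
    out[len++] = '\n';
    out[len] = '\0';
    return len;
}
```

Calling it in a loop from 0 to 29 and writing `len` bytes each time would reproduce the output of the lab.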



General Notes On Assembly

  First and foremost I must say that assembly is officially my favorite language to write in. Yes, it is completely impractical and insane to write programs with it, but it takes an incredible amount of talent to code with it. There definitely is an art to writing in assembly. Not only do you have to understand the architecture of the CPU, but you must find a way to efficiently manage the CPU’s registers and other resources.
  There are a lot of downsides to assembly however. First of all, remembering the system calls for each individual platform and architecture would prove to be impossible given the number of different architectures and platforms. There are just too many things to remember. Even with thorough documentation, the amount of time you would spend looking over at your notes to find out how to do a particular system call might not be worth the performance boost that you MIGHT get from using assembly in the first place. That being said, assembly is still awesome.
  One of the most interesting things that I noticed was that in our group we actually planned out our program. Well, first of all we had to get a grasp of what we were trying to do because I don’t think any one of us knew how to write this program in the beginning. But as for the planning part, we planned which registers we used where and how we would use them, and we tried to minimize the number of registers being used for efficiency’s sake. I don’t think there is a real equivalent of this in C for a simple looping program. Maybe for a giant application that requires at least 2 people, but not for just a simple program like this.
  The greatest upside to writing in assembly is just how fast and small the programs are. If you somehow manage to write a faster program than the compiler, then there is no equivalent to the speed and size of an assembly program. You are literally telling the CPU what to do. There are no layers between you and the CPU. If you tell the CPU to write at memory 0x230 which holds system critical data, the CPU WILL write to that address. It wasn’t until protected mode came into play that the operating system would kick in and say “I can’t let you do that.” On older operating systems you COULD overwrite the operating system.


  Part 1

  When we began working on the lab we had no idea how to do this. We were all confused about what to do and we wanted to begin working on different parts. Some of us suggested splitting up the registers and setting aside what each of those registers would be used for. Others wanted to use memory to store data and increment it. Some wanted to start with printing the byte, while others wanted to focus on the printing in general or on how we were going to set the byte. All three of us thought we were talking about the same thing when we weren’t even talking about the same parts. In short, it was a big mess. A lot of it was also because there was only one person coding and the others were telling him what to do.
  We eventually ran out of time in the lab and we as a group decided to work on it in the library. I brought along my laptop and decided that I wanted to try writing the code on my own. I felt that it would add some structure and we could get a sense of what to do. I wrote a generic version in x86_64 assembly with pseudo code used as a placeholder for parts that I didn’t know how to code. It took about five minutes, and when I showed it to the other two members of my group we finally got a sense of structure and direction. Eventually, after some headaches getting the addressing modes correct for the op-codes, we got a version that assembled correctly. And lo and behold, when we ran it… it didn’t work. Well, it worked partially. The program printed out the loop message correctly and ran 10 times, and the number of the loop was in the right spot like it should have been, but for whatever reason it was all on the same line. We were overwriting the new line character at the end of the message. We thought that our calculation to find the index of where to print the byte was wrong, but after some testing it turned out the index was correct and we were still overwriting the new line character. Finally we realized that it was because we were writing the full 64 bits to memory instead of a single byte, so the extra bytes landed on top of the new line. This simple bug along with a couple of others cost us 3 hours. It was still fun though.
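  The byte versus 64-bit bug is easy to reproduce in C. This is only a rough sketch and not our lab code: the message layout, the `DIGIT_INDEX` and `BUF_SIZE` constants and the helper names are all made up for illustration, and I’m assuming the digit sits right before the new line like it did in our message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: "Loop: 0\n", digit at index 6, new line at index 7.
   The buffer is padded to 16 bytes so the 8-byte store below stays in bounds. */
#define DIGIT_INDEX 6
#define BUF_SIZE 16

/* Fill buf (at least BUF_SIZE bytes) with the message template. */
static void init_msg(char *buf) {
    memset(buf, 0, BUF_SIZE);
    memcpy(buf, "Loop: 0\n", 8);
}

/* The bug: store a whole 64-bit register at the digit position. */
static void store_digit_wide(char *buf, int i) {
    assert(i >= 0 && i <= 9);
    uint64_t reg = (uint64_t)('0' + i);
    /* Writes 8 bytes (indexes 6 to 13), so the '\n' at index 7 is clobbered. */
    memcpy(buf + DIGIT_INDEX, &reg, sizeof reg);
}

/* The fix: store only the low byte of the register. */
static void store_digit_byte(char *buf, int i) {
    assert(i >= 0 && i <= 9);
    buf[DIGIT_INDEX] = (char)('0' + i);
}
```

  Calling `init_msg` and then `store_digit_wide` leaves a zero byte where the new line used to be, while `store_digit_byte` only touches the digit. That matches what we saw: the right number in the right spot, but everything on one line.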
  When we converted the first program to arm64, it was a simple conversion that mostly came down to changing the op-codes. That was it. We assembled and ran the program and… it didn’t work. Nothing printed to the screen. After fiddling with it for another half hour or so and scratching our heads, we decided to call it quits and go home. As soon as I got home, I compared the teacher’s write system call to ours and noticed that for one of the arguments he was loading the address of msg into x1 with the ADR op-code, while we were loading it with the MOV op-code. I quickly changed this, assembled it and ran it, and lo and behold it worked perfectly fine without error.


  Part 2

  We did part 2 individually. It was a lot simpler to write because I had gotten a grasp of some of the techniques in assembly from the previous part. I wrote it from scratch for practice, and I felt I could make it more efficient this way instead of cherry-picking code from the previous program.
  I started out the same way by setting up the message and the structure of the loop, and adding pseudo code to map out where I wanted things to go. I knew I needed to divide the index by 10 to get the tens column, and the remainder would be the ones column. I also knew that I needed to convert to ASCII, store the bytes in memory and print out the string, so I pseudo coded all these sections in. The main issue was that the divide instruction on x86_64 needs the %rdx register to be set to zero in order to divide properly, because it treats %rdx and %rax together as one big dividend. Initially I didn’t do this and I was getting all sorts of weird results. It would print out Loop: 00, 01 to 05 but then it would skip to 10, 11, 12 etc. This threw me off because it was ending the loop at 45 when there were only 30 elements there, so I actually thought that my conversions were wrong. It wasn’t until I reread the specification for divide that I realized that %rdx needed to be set to 0. As soon as I did that everything worked perfectly fine.
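  For comparison, here is roughly what the divide and convert step looks like in C. It’s a minimal sketch, not a translation of my assembly, and the function name `two_digits` is made up. The comments point at the registers the x86_64 divide instruction uses for the same values.

```c
#include <assert.h>

/* Write the two ASCII digits of n (0 to 99) into out[0] and out[1]. */
static void two_digits(unsigned n, char out[2]) {
    assert(n <= 99);
    unsigned tens = n / 10;  /* quotient: what div leaves in %rax */
    unsigned ones = n % 10;  /* remainder: what div leaves in %rdx */
    out[0] = (char)('0' + tens);  /* adding '0' converts 0..9 to ASCII */
    out[1] = (char)('0' + ones);
}
```

  In C the compiler takes care of clearing the upper half of the dividend for you. In assembly, the remainder from the previous pass through the loop is still sitting in %rdx unless you zero it yourself, which is exactly the kind of thing that bit me.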


  Part 3

  Part 3 took probably five minutes to do. It only required putting in a jump instruction so that if the tens column to be printed was a 0, the program would put a space there instead. That was it.
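  The same idea in C, as a quick sketch. The function name `two_digits_padded` is hypothetical, and the `if` stands in for the conditional jump in the assembly version.

```c
#include <assert.h>

/* Write n (0 to 99) as two characters, using a space instead of a leading zero. */
static void two_digits_padded(unsigned n, char out[2]) {
    assert(n <= 99);
    unsigned tens = n / 10;
    unsigned ones = n % 10;
    if (tens == 0) {
        out[0] = ' ';                 /* tens column is 0: print a space */
    } else {
        out[0] = (char)('0' + tens);  /* otherwise print the tens digit */
    }
    out[1] = (char)('0' + ones);      /* ones column is always printed */
}
```

  So 7 comes out as a space followed by 7, and 30 still comes out as 30.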


 Notes on the Different Architectures and Debugging
 

  X86_64

  - simpler instructions
  - required more instructions to do a complete operation
  - the weird register names such as rax, rbx and r12 made it somewhat easier to organize how you were going to use those registers


  ARM64

  - more complicated instructions made it easier to do things.
  - you could specify where to store the result of an instruction
  - branching is weird. Why did they call it that?
  - easier to write programs for because the registers were more organized   


  Debugging

  - Awful. Hate it. Don’t like GDB
  - debugging is very difficult
  - took us forever to find a simple bug
  - no “intellisense”

Tagged: spo600