Multithreaded SOAP server using Qt and C++

Sunday, January 21, 2018

In today's world, being able to serve SOAP requests is a common requirement.  As an enterprise grows, the need for fast response times and optimizations at every possible place becomes more and more important.  I wrote a generalized SOAP server in C++ using Qt Creator and gsoap that connects to a database and runs in multiple threads with prepared statements.  It's a somewhat involved process, but I hope the ideas here can help others with similar problems.

Environment: Qt Creator, Postgres 9.4, gsoap 2.8 on Ubuntu.  Very little, if anything, is OS specific, other than that it's made for *nix.  So FreeBSD and the like should work just fine too.

By the end of this, you will have a SOAP server that connects to a database and uses multiple threads which are already started and connected to the database, with prepared statements ready to go and an easy path forward to develop new services on it.  We will also add the ability to retrieve the WSDL from the service by accessing it via http://service/?wsdl, otherwise known as a GET handler.

Step 1: Create the project in Qt

In Qt Creator, create a new project of type Qt Console Application.  For this example, we will use gsoap to create C++ classes which are generated from a .h file.  It should place the generated files in a subdirectory named soap.  First off, let's create the soap definition file.  In this example it is soapDef.h, which will be running in the "beer" namespace and named "beersoap".  Yup, it's for a brewing website backend.  Make sure the QT variable in your .pro file includes network, core, and sql, and add LIBS += -lgsoap++


//gsoap beer service name: beersoap
//gsoap beer service port: http://localhost:7575/
//gsoap beer service namespace: urn:beersoap

/**
 * Simple ping operation to verify operation
 */
int beer__ping(void *,char *&pong);

/**
 * Count user records, used mainly for testing DB connection
 */
int beer__usercount(void *_,int &numUsers);

Secondly, to support WSDL retrieval, we can convert the wsdl file generated by gsoap into a C++ file with the text as a variable which we can reference by an extern.  A little shell script magic will do that for us with the following file: wsdl2cpp.sh

#!/bin/sh

wsdlfile=${1}
cppfile=${2}

echo "const char *wsdlTxt = " > ${cppfile}
cat ${wsdlfile} | sed -e 's:":\\":g' \
                      -e 's:^:":' \
                      -e 's:$:\\n":' >> ${cppfile}
echo ";" >> ${cppfile}
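For readers who would rather not depend on sed, the same conversion can be sketched in C++.  This is only an illustration of what the script does, not part of the project: the function name toCppSource is an assumption, and unlike the script it also escapes backslashes and starts with an empty literal so that an empty input still compiles.

```cpp
#include <sstream>
#include <string>

// Turn arbitrary text into C++ source that defines `const char *<varName>`
// holding that text, one quoted string literal per input line.
// toCppSource is a hypothetical helper name used only for this sketch.
std::string toCppSource(const std::string &text, const std::string &varName)
{
    std::ostringstream out;
    // Start with an empty literal so an empty input still yields valid C++.
    out << "const char *" << varName << " = \"\"\n";
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        out << '"';
        for (char c : line) {
            if (c == '\\' || c == '"') out << '\\'; // escape backslash and quote
            out << c;
        }
        out << "\\n\"\n"; // a literal \n inside the string, then the closing quote
    }
    out << ";\n";
    return out.str();
}
```

Adjacent string literals are concatenated by the compiler, which is why one literal per line works for both the script and this sketch.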


And finally tie it all together inside your .pro file


GSOAPFLAGS=-S -L -c++11 -x -i -d $${_PRO_FILE_PWD_}/soap

gsoap.depends = $${_PRO_FILE_PWD_}/soapDef.h
gsoap.target = $${_PRO_FILE_PWD_}/soap/soapC.cpp
gsoap.commands = \
    soapcpp2 $${GSOAPFLAGS} $${_PRO_FILE_PWD_}/soapDef.h && \
    $${_PRO_FILE_PWD_}/wsdl2cpp.sh \
       $${_PRO_FILE_PWD_}/soap/beersoap.wsdl \
       $${_PRO_FILE_PWD_}/wsdl.cpp
QMAKE_EXTRA_TARGETS += gsoap
PRE_TARGETDEPS += $${_PRO_FILE_PWD_}/soap/soapC.cpp

Now, when we build, the generated file wsdl.cpp defines a const char * named wsdlTxt that holds the entire WSDL file.  The GSOAPFLAGS we use mean: server side code only (-S), no library generation (-L), use C++11 (-c++11), no sample XML files (-x), create a C++ class for our methods (-i), and write the output to the soap subdirectory of the project (-d).

Step 2: Create Database

The next step is to create a database to test this out with.  Create a users table; we are just going to have a method to count the records in it.  Then create another table called prepstmts with two fields: a char(32) or similar for the statement name and a text field for the SQL.  Once the server connects, it will use this table to create a collection of prepared statements.  By far the longest running database operations are connecting and preparing a statement, so we want to move those into the startup routines instead of doing them during request processing.  Next, create a configuration file for the program to use.  We will specify its location using Qt's command line option parser framework.



# Beersoap config file

dbhost = localhost
dbuser = beergineer
dbpass = 
dbname = beergineer
tcplisten = 7575
threadpool = 5
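
To make the file format concrete, here is a minimal sketch of the parsing rule using only the standard library: blank lines and lines starting with # are skipped, and everything else is split at the first equals sign.  The names trim and parseConfig are assumptions for illustration; the Qt based Config class further down is what the server actually uses.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing whitespace.
static std::string trim(const std::string &s)
{
    const char *ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines into a map.
// parseConfig is a hypothetical name used only for this sketch.
std::map<std::string, std::string> parseConfig(std::istream &in)
{
    std::map<std::string, std::string> settings;
    std::string line;
    while (std::getline(in, line)) {
        const std::string t = trim(line);
        if (t.empty() || t[0] == '#') continue;  // blank line or comment
        const auto eq = t.find('=');
        if (eq == std::string::npos) continue;   // not a key = value line
        settings[trim(t.substr(0, eq))] = trim(t.substr(eq + 1));
    }
    return settings;
}
```

Note that an entry such as "dbpass = " is kept with an empty value, so a check for required keys can still tell "present but empty" from "missing".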

OK, now we have a config file, and our makefile will handle the gsoap stuff for us automatically (remember to add all the .cpp and .h files it creates to your project, including wsdl.cpp).  Now let's do some coding and show how easy Qt makes a lot of the mundane C++ tasks, and how relatively simply something as advanced as a thread pool can be written and managed.

Step 3: main.cpp

This is the application entry point.  As any good programmer will tell you, your main function should do initialization and basic sanity checks, then launch the real application.  Qt makes things like parsing command line options in a nice, standard Unix-like way really easy, especially with C++11.  Here is my main.cpp file:

#include <QCoreApplication>
#include <iostream>
#include <QCommandLineParser>

#include "Config.h"
#include "Server.h"

int main(int argc, char *argv[])
{
   QCoreApplication a(argc, argv);
   QCoreApplication::setApplicationName("beersoapd");
   QCoreApplication::setApplicationVersion("DEV");

   QCommandLineParser parser;
   parser.setApplicationDescription("Beergineer Soap Server");
   parser.addHelpOption();
   parser.addVersionOption();

   parser.addOptions({
      {{"c","configFile"},"Config File Location","configFile"}
   });
   parser.process(a);

   Config *myConfig = new Config(parser.value("configFile"));
   if (!myConfig->isValid()) {
      std::cerr << "Config invalid" << std::endl;
      return 1;
   }
   new Server(&a,myConfig);

   return a.exec();
}


Don't worry about the Config and Server objects yet, we will create those in a minute.  With this in place, calling the program with -h shows the help, -v shows the version, and the config file we created earlier can be specified with either -c or --configFile.  The output is formatted well and all the nastiness of getopt and the old school way is handled for you.  The reasons for a config file instead of a plethora of switches are extensibility, keeping commands reasonable in length, and security.  If everything is an option, any user on the machine can see things like passwords when they are specified on the command line.  The Config object is really simple: it reads a file, skips any lines that are blank or start with #, splits the rest at the equals sign, stores the pairs in a QMap, and provides an accessor method to get the values.  Here's the header:

Config.h:

#ifndef CONFIG_H
#define CONFIG_H

#include <QString>
#include <QMap>

class Config
{
   public:
      Config(QString confFile);
      QString getSetting(QString name);
      bool isValid() { return myValid; }
   private:
      QMap<QString,QString> config;
      bool myValid;
};

#endif // CONFIG_H

And Config.cpp
#include "Config.h"
#include <QFile>
#include <iostream>

Config::Config(QString confFile)
{
   config.clear();
   myValid = false;
   QFile f(confFile);
   if (!f.exists()) {
      std::cerr << "File does not exist: " << confFile.toStdString() << std::endl;
      return;
   }
   f.open(QIODevice::ReadOnly);
   while (!f.atEnd())
   {
      QString line = f.readLine();
      if (line.trimmed().length() == 0 || line.startsWith("#")) continue;

      int idx = line.indexOf("=");
      if (idx == -1) continue; // not a key = value line, skip it
      QString key = line.left(idx).trimmed();   // everything before the '='
      QString val = line.mid(idx+1).trimmed();  // everything after the '='
      config[key] = val;
   }
   f.close();
   QStringList reqattribs;
   reqattribs << "dbhost" << "dbuser" << "dbpass" << "dbname" << "tcplisten" << "threadpool";
   if (!config.size()) return;
   for (int i =0; i < reqattribs.size(); i++)
   {
      if (config.find(reqattribs.at(i)) == config.end()) {
         std::cerr << "Missing required value: " << reqattribs.at(i).toStdString() << std::endl;
         return;
      }
   }
   myValid = true;
}

QString Config::getSetting(QString name)
{
   if (config.find(name) == config.end()) return "";
   return config[name];
}

As you can see, it is fairly simple boilerplate code.  Next we get into the actual networking part.  Qt makes this exceptionally easy using the QTcpServer class.  So without further ado, here is the Server.h class definition:

#ifndef SERVER_H
#define SERVER_H

#include <QTcpServer>
#include <QVector>

#include "Config.h"
#include "SoapThread.h"

class Server : public QTcpServer
{
      Q_OBJECT

   public:
      Server(QObject *parent,Config *confPtr);

   private:
      Config *myConfig;
      void incomingConnection(qintptr handle);
      QVector<SoapThread *> myThreads;

   signals:
      void newConnection();
};

#endif // SERVER_H

A few items of note here.  Using this class is very easy: you pretty much just tell it which port / address to use, and the override of incomingConnection will be passed the new socket descriptor.  The vector of SoapThread objects is initialized on startup, and in the next file I will show how we hand a connection to a thread.  Here is the Server.cpp file:
#include "Server.h"
#include <iostream>
#include <sched.h>   // sched_yield()
#include <unistd.h>  // ::close()
#include "SoapThread.h"

Server::Server(QObject *parent,Config *confPtr)
{
   setParent(parent);
   listen(QHostAddress::Any,confPtr->getSetting("tcplisten").toUShort());
   myConfig = confPtr;
   int maxThreads = confPtr->getSetting("threadpool").toInt();
   for (int i =0; i < maxThreads; i++)
   {
      SoapThread *p = new SoapThread(confPtr);
      myThreads.append(p);
      p->start();
      p->moveToThread(p);
      connect(this,SIGNAL(newConnection()),p,SLOT(serveRequest()));
   }
}

void Server::incomingConnection(qintptr handle)
{
   int maxThreads = myConfig->getSetting("threadpool").toInt();
   int tries = 50000;
   while (tries)
   {
      for (int i =0; i < maxThreads; i++)
      {
         SoapThread *p = myThreads[i];
         if (!p->isBusy())
         {
            p->setPendingDescriptor(handle);
            emit(newConnection());
            return;
         }
      }
      tries--;
      sched_yield();
   }
   ::close(handle);
}

There's a lot packed in here; this is the heart of the application that does the magic.  First, in the constructor, we set up the listening socket.  That call returns immediately, and the actual accepting of connections happens when the QCoreApplication enters its event loop in the exec() call.  Then we create the threads.  Each one is created, then started, and then (very important, because it's not intuitive) the object is moved to its own thread.  The event loop and signal / slot dispatching for the object remain in the creating thread until you actually move it after the thread is started.  The single signal is connected to each thread, and the thread that has the socket descriptor set will process the request.  In real practice a minimal number of threads would be started; if all of them are busy, another one would be started and used to serve the request, and as the threads are used less and less, after a sufficient timeout, a thread would clean itself up.  For our purposes, though, this is just an example.
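
The "first idle worker takes the descriptor" idea can be sketched without Qt.  This is a simplified illustration under my own assumptions, not the server code: WorkerSlot and dispatch are hypothetical names, and the busy check and the assignment are done under one lock so the same worker can never be handed two connections.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// One slot per pre-started worker (hypothetical type for this sketch).
class WorkerSlot
{
public:
    // Returns true if this worker was idle and now owns the descriptor.
    bool tryClaim(int descriptor)
    {
        std::lock_guard<std::mutex> guard(lock_);
        if (busy_) return false;
        busy_ = true;
        pending_ = descriptor;
        return true;
    }

    // Called by the worker once the request has been served.
    void release()
    {
        std::lock_guard<std::mutex> guard(lock_);
        busy_ = false;
        pending_ = -1;
    }

    // The descriptor currently assigned, or -1 when idle.
    int pending()
    {
        std::lock_guard<std::mutex> guard(lock_);
        return pending_;
    }

private:
    std::mutex lock_;
    bool busy_ = false;
    int pending_ = -1;
};

// Hand the descriptor to the first idle worker.  Returns the worker's index,
// or -1 if every worker is busy (the caller can then retry or close the socket).
int dispatch(std::vector<WorkerSlot> &pool, int descriptor)
{
    for (std::size_t i = 0; i < pool.size(); ++i)
        if (pool[i].tryClaim(descriptor)) return static_cast<int>(i);
    return -1;
}
```

A pool is created with std::vector<WorkerSlot> pool(5); and each accepted socket goes through dispatch(pool, fd).  Marking the slot busy at claim time, rather than when the worker wakes up, is the design choice that keeps a second connection from landing on a worker that has been assigned but has not started yet.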

Now for the thread class itself.
#ifndef SOAPTHREAD_H
#define SOAPTHREAD_H

#include "Config.h"
#include <QThread>
#include <QReadWriteLock>
#include <QSqlDatabase>
#include <QSqlQuery>

#include "soap/soapbeersoapService.h"

class SoapThread : public QThread
{
      Q_OBJECT

   public:
      SoapThread(Config *cfgPtr);
      void run();
      bool isBusy();
      void setPendingDescriptor(qintptr);
      QSqlDatabase dbHandle;
      QMap<QString,QSqlQuery> prepStmts;

   public slots:
      void serveRequest();

   private:
      Config *myConfig;
      beersoapService *mySoap;
      bool myBusy;
      QReadWriteLock myLock;
      qintptr myPendingDescriptor;

};

#endif // SOAPTHREAD_H
A little bit to explain here.  First, the code is in a threaded environment, so a lock is essential.  Qt makes locking easy, but it must still be done with care.  We also have not only a database handle, which must be unique per thread, but also the prepared statements: every row in the prepstmts table (from step 2) becomes a prepared statement, stored by name in the prepStmts map.

And the implementation:
#include "SoapThread.h"
#include <QObject>
#include <QWriteLocker>
#include <QReadLocker>
#include "Server.h"

int get_handler(struct soap *s);

SoapThread::SoapThread(Config *cfgPtr)
{
   myConfig = cfgPtr;
   myBusy = true;
   myPendingDescriptor = 0;
}

void SoapThread::run()
{
   QWriteLocker lock(&myLock);
   mySoap = new beersoapService();
   mySoap->user = (void *)this;
   mySoap->fget = get_handler;
   dbHandle = QSqlDatabase::addDatabase("QPSQL",
                   QString("%1").arg((quintptr)this,QT_POINTER_SIZE *2,16,QChar('0')));
   dbHandle.setDatabaseName(myConfig->getSetting("dbname"));
   dbHandle.setUserName(myConfig->getSetting("dbuser"));
   dbHandle.setPassword(myConfig->getSetting("dbpass")); // empty in the sample config
   QString hostname = myConfig->getSetting("dbhost");
   if (hostname.length() > 0 && hostname != "localhost")
      dbHandle.setHostName(hostname);
   dbHandle.open();
   QSqlQuery pstmtq(dbHandle);
   pstmtq.exec("SELECT * FROM prepstmts");
   prepStmts.clear();
   while (pstmtq.next())
   {
      QSqlQuery p(dbHandle);
      p.prepare(pstmtq.value(1).toString().trimmed());
      prepStmts.insert(pstmtq.value(0).toString().trimmed(),p);
   }
   pstmtq.finish();
   myBusy = false;
   lock.unlock();
   exec();
}

void SoapThread::serveRequest()
{
   // Check and update the shared state under the lock
   QWriteLocker lock(&myLock);
   if (!myPendingDescriptor || myBusy) return;
   myBusy = true;
   mySoap->socket = myPendingDescriptor;
   myPendingDescriptor = 0;
   lock.unlock();

   mySoap->serve();

   lock.relock();
   myBusy = false;
}

bool SoapThread::isBusy()
{
   QReadLocker lock(&myLock);
   return myBusy;
}

void SoapThread::setPendingDescriptor(qintptr d)
{
   // Take the lock before looking at the shared state
   QWriteLocker lock(&myLock);
   if (myBusy || myPendingDescriptor) return;
   myPendingDescriptor = d;
}

OK, to begin with, the real magic doesn't happen in the constructor.  At that point we are still running in the original thread.  Once start() is called by the Server object, run() is called here, but in the new thread that this object represents.  That is where we make the connection to the database, create the prepared statements, and set up everything else.  The setPendingDescriptor method sets the descriptor of the incoming request for this thread to execute momentarily.  isBusy returns whether the thread is still executing a request.  All variables read or set from outside the thread need locking around them, as shown above.  There is also far more error checking / handling that must be done for the database access and prepared statements, but for the sake of brevity I just connected.  When the Server object has an incoming connection, the serveRequest function actually gets called in every thread.  But since only the first non-busy thread had its descriptor set, the other ones just return immediately.  The gsoap object is also created in the run method, and its fget field is set to a function shown next.  This supports non-SOAP calls to the server (remember http://address?wsdl).  Yup, that's coming up.  We also use the user field to point to the thread object so we have access to its public members: the prepared statements and the connected database handle.
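
One detail in run() worth spelling out is the connection name: each thread registers its database connection under its own address, formatted as zero-padded hex, so no two threads share a name.  A standard library sketch of that formatting follows; connectionNameFor is a hypothetical name, and the output is meant to resemble, not exactly reproduce, what the QString::arg call produces.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format an object's address as zero-padded lowercase hex, two digits per
// byte of pointer, so every live object gets a distinct fixed-width name.
// connectionNameFor is a hypothetical helper used only for this sketch.
std::string connectionNameFor(const void *object)
{
    char buf[2 * sizeof(void *) + 1];
    std::snprintf(buf, sizeof(buf), "%0*llx",
                  static_cast<int>(2 * sizeof(void *)),
                  static_cast<unsigned long long>(
                      reinterpret_cast<std::uintptr_t>(object)));
    return buf;
}
```

Using the object's address is a cheap way to get a unique name without a shared counter, as long as the name is only used while the object is alive.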

The next and last file is the implementation of the soap methods themselves.
/*
 * Core and standard functions for the beersoap service
 */
#include "soap/beersoap.nsmap"

#include <cstring>   // strlen()
#include <QString>
#include <QVariant>
#include "soap/soapbeersoapService.h"
#include "SoapThread.h"

extern const char *wsdlTxt;

/**
 * \brief Used to determine soap service operation, simply returns the string "PONG!"
 *
 * \param [in] pong Pointer to return string
 * \return status of soap operation in this case always OK
 */
static const char *pongStr = "PONG!";
int beersoapService::ping(void *_param_1, char *&pong)
{
   (void)_param_1;
   pong = (char *)pongStr;
   return SOAP_OK;
}

/**
 * \brief Handle GET requests
 */
int get_handler(struct soap *s)
{
   if (!s) return SOAP_GET_METHOD;
   QString url = s->path;
   int idx = url.indexOf("?");
   if (idx == -1) {
      // handle generic get request here
      return 404;
   }
   QString queryStr = url.right(url.length()-idx-1);

   // WSDL Request
   if (queryStr == "wsdl")
   {
      s->http_content = "text/xml";
      soap_response(s,SOAP_FILE);
      soap_send_raw(s,wsdlTxt,strlen(wsdlTxt));
      soap_end_send(s);
      return SOAP_OK;
   }

   return 404;
}

/**
 * \brief Count the rows in the users table, demonstrates database access
 */
int beersoapService::usercount(void *_, int &numUsers)
{
   (void)_; // unused
   numUsers = -1;
   SoapThread *sThread = (SoapThread *)user;
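   auto it = sThread->prepStmts.find("countUsers");
   if (it == sThread->prepStmts.end()) return SOAP_OK; // statement not loaded, report -1
   QSqlQuery q = it.value();
   // Only read the result if the query ran and returned a row
   if (q.exec() && q.first())
      numUsers = q.value(0).toInt();
   q.finish();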
   return SOAP_OK;
}
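
The routing decision inside get_handler boils down to "take whatever follows the first ? in the path and compare it to wsdl".  Here is that rule on its own as a small sketch with no gsoap or Qt dependency, handy for unit testing; queryOf and isWsdlRequest are hypothetical names.

```cpp
#include <string>

// Return the part of a request path after the first '?', or "" if there is none.
std::string queryOf(const std::string &path)
{
    const auto q = path.find('?');
    return q == std::string::npos ? std::string() : path.substr(q + 1);
}

// True when a GET for this path should be answered with the WSDL text.
bool isWsdlRequest(const std::string &path)
{
    return queryOf(path) == "wsdl";
}
```

Anything other than an exact wsdl query falls through to the 404 branch, which matches the handler above.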

So, after all of that, once you get it to compile and run, you have gsoap objects running in a multithreaded server.  The database connections are already open and the statements already prepared before any service function runs, so, as you can see, with only a few modifications a service can be made to run very quickly.  Hopefully some of the explanations and code here can help someone.

FreeBSD 9.0 and up - How to set up an IPSec VPN in the real world

After a frustrating week trying to figure out how to do this, I finally got it.  Rather than belabor why I switched to OpenBSD years ago, why I want to come back to FreeBSD, and why I am still not completely sure I made the right decision, I will forgo the political talk and move on to the technical details.

Here is my setup.  I work from a remote location.  I have a whole /24 subnet here at home, and I want all of the computers on it to be able to access multiple subnets at work as if I were in the building.  I used to use VPN client software, aka Nortel, but that was very picky and dropped frequently.  A couple of years ago, we realized the box we had been using also spoke IPSec, so we endeavored to set up a connection between the two.  I got it working great on OpenBSD, but since there were multiple subnets, I would frequently lose one of them and have to "bounce" the whole IPSec thing.  In fact it became so frequent (2-3 times a day) that I wrote a shell script and would just run it when I saw that things had hung.  I got to where I was lucky enough to catch it before a terminal session dropped and I lost a bunch of typing I had done in a vi session.  I heard FreeBSD is better, so I'm willing to try it.  I spent all weekend making the switch and I will let you know about the performance later when I've had a chance to try it myself during a typical workday.

Ok, so just to be clear.  The end result is we are going to connect a subnet at a home network / small office with multiple subnets at a corporate office using IPSec and FreeBSD 9.0.  Inserted that to help Google find it.

Step 1 - Rebuild the kernel with IPSec support.

During install, make sure and select that you want the system source installed as well as the ports tree.

We are going to rebuild a kernel on an x64 (amd64) system and call it ROUTER.  You can adjust it to what you need (i386, for example); if you're reading this, that should make perfect sense.
cd /usr/src/sys/amd64/conf
cp GENERIC ROUTER
vi ROUTER

Inside the file change ident to ROUTER, and add the following lines somewhere

option IPSEC
option IPSEC_NAT_T
device crypto

Maybe take out some stuff you don't need, ie SCSI, RAID cards, whatever.

then
cd ../../..
make buildkernel KERNCONF=ROUTER && make installkernel KERNCONF=ROUTER

Go have a cup of coffee or watch a Star Trek episode, it will be busy for 30 mins or so.  Once it's done, reboot and you will be in your new kernel.  In case you came here after reading the handbook, the option IPSEC_NAT_T is there to make an error go away that racoon throws if it's not compiled in.  It may not technically be needed, but it takes one possible cause off the table when troubleshooting.  The error is something like "unable to set udp encap".

Step 2 - Install ipsec-tools from ports.

The easiest way, assuming you already have basic networking up: pkg_add -r ipsec-tools

Or cd /usr/ports/security/ipsec-tools then make install if you need special options.  If you are reading this, you are probably fine with the defaults.

Step 3 - Initial networking setup

From here on out I will use something very similar to the network I actually set up.  The first step is to set up one subnet, then we can add the other one(s) after it's working.  If you came here from the handbook and were dreading having to find out how the box on the other end is set up internally, there is good news: a gif0 tunnel is NOT required.

My setup is as follows.  I am behind a regular DSL wireless router, so that wireless devices and my TV don't have access to the corporate network, and my IP address is 192.168.0.2.  My internal subnet that I want to connect with work is 192.168.231.0/24.  My internet facing address is 66.22.33.44.  The network I am trying to connect to is 189.168.157.0/24 and the VPN server is 189.168.156.233.  Later I will want to add the 189.168.158.0/24 network as well.

Another point of interest, unlike OpenBSD, a connection attempt that will use the tunnel must be made before it will start to negotiate a connection. So, here is what I did at first and later migrated to the startup scripts.

route add 189.168.157.0/24 192.168.231.1

Save a file named pingit.sh with the following command in it (assuming you know that a pingable machine on the other side exists at octet 11).  It will send a single ping to the box to start the tunnel, but not wait very long.

ping -c 1 -W 1 189.168.157.11

Save this file for later when we are ready to test...

Step 4 - IPSec tools configuration

I am using a pre-shared key in my setup, so the first file is psk.txt.  It lives in /usr/local/etc/racoon, as do the other files.

It has the format: host key one per line, so in my case it would be

189.168.156.233  Mysecretpa33word

Wow, difficult, huh?  I didn't pay attention and just put the key in there without the host, and it took me 2 hours to figure it out.  It also took me another hour to figure out that it needs chmod 600 psk.txt.

The next file, although deceptively easy, is picky and doesn't give much in the way of informative error messages when things go wrong.

ipsec.conf or setkey.conf, your choice I guess.

flush;
spdflush;
spdadd 192.168.231.0/24 189.168.157.0/24 any -P out ipsec esp/tunnel/192.168.0.2-189.168.156.233/unique;
spdadd 189.168.157.0/24 192.168.231.0/24 any -P in ipsec esp/tunnel/189.168.156.233-192.168.0.2/unique;

Pretty self explanatory, but one note.  If you have only one subnet to join, use "require" instead of "unique" on the end of the line.

spdadd yoursubnet theirsubnet any -P out ipsec esp/tunnel/youraddr-theiraddr/unique;

and reverse the addresses and use in instead of out.

Now, if you get errors about a send error when testing, check this file.  It's deceptively easy to write, and it's just as easy to end up with an address transposed.

Now on to the racoon.conf file:

path    pre_shared_key  "/usr/local/etc/racoon/psk.txt"; #location of pre-shared key file
log     debug2; #log verbosity setting: set to 'notify' when testing and debugging is complete

padding # options are not to be changed
{
        maximum_length  20;
        randomize       off;
        strict_check    off;
        exclusive_tail  off;
}

timer   # timing options. change as needed
{
        counter         5;
        interval        20 sec;
        persend         1;
#       natt_keepalive  15 sec;
        phase1          30 sec;
        phase2          15 sec;
}

listen  # address [port] that racoon will listen on
{
        isakmp          192.168.0.2 [500];
}

remote  189.168.156.233
{
        exchange_mode   main;
        my_identifier address 66.22.33.44;
        nat_traversal off;
        initial_contact on;
# Phase 1
        proposal {
                encryption_algorithm    3des;
                hash_algorithm          md5;
                authentication_method   pre_shared_key;
                lifetime time           3600 sec;
                dh_group                modp1024;
        }

}


#phase 2
sainfo  anonymous
{
        lifetime        time    1200 sec;
        encryption_algorithm    3des;
        authentication_algorithm      hmac_md5;
        compression_algorithm   deflate;

}

Pay most attention to the bolded areas above.

Step 4 - Firing it up

Set up 3 terminal sessions.  In the first, keep tcpdump -lnvvi <youroutsideinterface> udp port 500 running.  Use the second to run the script from step 3, and the last one to run the VPN.

First, load the policies.  This only needs to be done once, or again if the file changes.
setkey -f /usr/local/etc/ipsec.conf

Then to fire it up:
racoon -F

That tells it to run in foreground mode outputting to stdout so you can see it.
Assuming there are no errors and it stays running, run the script from step 3: ./pingit.sh

It may take 3-4 tries to connect successfully; keep trying unless the output shows a reason why it cannot.  Eventually you will not only see pings come back, but also a message like this:

 UPDATE succeeded: ESP/Tunnel 192.168.0.2[500]->189.168.156.233[500] spi=67865083(0x40b89fb)

Here are some common other errors you may see and their solutions:

If you get

Jul  7 08:15:28 maricopacomputer racoon: DEBUG: 1 times of 100 bytes message will be sent to 189.168.156.233[500]
Jul  7 08:15:28 maricopacomputer racoon: DEBUG:  2fed9acf 4dbbc616 00000000 00000000 01100200 00000000 00000064 0d000034 00000001 00000001 00000028 01010001 00000020 01010000 800b0001 800c0e10 80010005 80030001 80020001 80040002 00000014 afcad713 68a1f1c9 6b8696fc 77570100
Jul  7 08:15:28 maricopacomputer racoon: DEBUG: resend phase1 packet 2fed9acf4dbbc616:0000000000000000

and no reply, check that you aren't routing traffic to the end host through your internal network.


If you get:

2012-07-07 00:16:02: ERROR: phase1 negotiation failed due to send error. dad1f78e51bb5b7e:0000000000000000
2012-07-07 00:16:02: ERROR: failed to begin ipsec sa negotication.

Check setkey.conf closely, especially that the addresses are not transposed or anything.

Step 5 - Adding other subnets

Now that you have subnet 1 working, let's say you want to add a second subnet.

That is very easy.  First of all, add the new records in ipsec.conf, 2 for each subnet.  Then, notice how I had sainfo anonymous in racoon.conf?  Use that, it makes things so much easier than specifying individual subnets over again.  In ipsec.conf, change require to unique.  Otherwise it will only be able to use one subnet at a time, and will drop one and connect the other each time there is traffic for the other one.
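
Since every extra subnet needs the same pair of lines with the addresses swapped, which is exactly where transposition mistakes creep in, the lines can be generated instead of typed.  The following is a small sketch, not a tool that ships with ipsec-tools; the function name policyLines is an assumption, and the output format simply mirrors the spdadd lines shown in step 4.

```cpp
#include <string>
#include <vector>

// Build the setkey policy lines for one local subnet and several remote
// subnets: an "out" and an "in" rule per remote subnet, all marked "unique".
// policyLines is a hypothetical helper used only for this sketch.
std::vector<std::string> policyLines(const std::string &localNet,
                                     const std::vector<std::string> &remoteNets,
                                     const std::string &localGw,
                                     const std::string &remoteGw)
{
    std::vector<std::string> lines;
    for (const std::string &remote : remoteNets) {
        // Outbound: local subnet to remote subnet, tunnel from us to them.
        lines.push_back("spdadd " + localNet + " " + remote +
                        " any -P out ipsec esp/tunnel/" +
                        localGw + "-" + remoteGw + "/unique;");
        // Inbound: everything reversed.
        lines.push_back("spdadd " + remote + " " + localNet +
                        " any -P in ipsec esp/tunnel/" +
                        remoteGw + "-" + localGw + "/unique;");
    }
    return lines;
}
```

For the network in this post the call would be policyLines("192.168.231.0/24", {"189.168.157.0/24", "189.168.158.0/24"}, "192.168.0.2", "189.168.156.233"), with flush; and spdflush; still placed by hand at the top of the file.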

Liability:

This set up worked for me.  It may not work for you.  Hopefully it helps other sysadmins out there set up ipsec tunnels in FreeBSD 9.0.  Happy VPN'ing!

Update:

After getting this set up over the weekend, today was a day with it in use for work.  And let me say... nice.  It seems more responsive and even a little quicker than OpenBSD.  It may be a beyatch to set up, but it runs like a champ once it is.  

Qt5 Tic-Tac-Toe

This was a fairly simple tic-tac-toe game.  The logic behind it is that it goes through every available move and sees which square will give the computer the highest chance of winning.  If two or more squares are equal, it just selects one at random.  To build this program, you will need a C++ compiler and Qt version 5, including qmake, installed.  The choice between using the characters X and O and using native drawing commands was a tough one, but I finally settled on native drawing to a fairly big pixmap that is then applied to the squares.  The squares themselves are buttons, which made the event system a lot easier to use.

Get source from here: https://github.com/beneschtech/qt5-tictactoe
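
To illustrate the move selection described above, here is one possible reading of it as a standalone sketch: for each empty square, play out every continuation and count the fraction that end in a win for the computer.  This is an assumption about what "highest chance of winning" means and is not taken from the repository; Board, winner, playOut, and bestMoves are hypothetical names.

```cpp
#include <array>
#include <vector>

using Board = std::array<char, 9>; // 'X', 'O', or ' ' for an empty square

// Return 'X' or 'O' if that player has three in a row, otherwise ' '.
char winner(const Board &b)
{
    static const int lines[8][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6},
                                    {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}};
    for (const auto &l : lines)
        if (b[l[0]] != ' ' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
            return b[l[0]];
    return ' ';
}

struct Tally { long long wins = 0; long long games = 0; };

// Play out every possible continuation and count how many end in a win for `me`.
void playOut(Board &b, char toMove, char me, Tally &t)
{
    const char w = winner(b);
    bool full = true;
    for (char c : b) if (c == ' ') full = false;
    if (w != ' ' || full) {          // game over: win, loss, or draw
        ++t.games;
        if (w == me) ++t.wins;
        return;
    }
    for (int i = 0; i < 9; ++i) {
        if (b[i] != ' ') continue;
        b[i] = toMove;
        playOut(b, toMove == 'X' ? 'O' : 'X', me, t);
        b[i] = ' ';                  // undo the move
    }
}

// Return every empty square that ties for the highest win ratio for `me`.
std::vector<int> bestMoves(Board b, char me)
{
    std::vector<int> best;
    Tally bestTally;
    for (int i = 0; i < 9; ++i) {
        if (b[i] != ' ') continue;
        b[i] = me;
        Tally t;
        playOut(b, me == 'X' ? 'O' : 'X', me, t);
        b[i] = ' ';
        // Compare wins/games ratios by cross-multiplying to stay in integers.
        const long long lhs = t.wins * bestTally.games;
        const long long rhs = bestTally.wins * t.games;
        if (best.empty() || lhs > rhs) { best.assign(1, i); bestTally = t; }
        else if (lhs == rhs) best.push_back(i);
    }
    return best;
}
```

Picking a random element of the returned vector gives the "if two or more squares are equal, select one at random" behavior.  Counting outcomes this way treats the opponent as moving at random; a minimax search would be the usual choice if the computer should never lose.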

Sunday, November 26, 2017

ELF Loader

We all use slightly different environments to do OS development, mine is BOCHS on a *nix platform of some kind and creating a final ELF file to load into memory.  ELF has a very adaptable format, and can pack multiple things into one file.  For example, your main kernel up in high memory and a stub for BIOS calls down in low memory, all in one file.  A lot of bootloader examples out there use the PHT (Program Header Table) to determine where to load code at.  I however, use the SHT (Section header table).  That allows you to insert a random binary into the final ELF using objcopy at an alternate address.
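
To show what walking the SHT involves, here is a hedged C++ sketch that reads the same ELF64 header fields from an in-memory image and lists the sections a loader would copy.  It is an illustration for a hosted environment, not the bootloader itself: readAt, LoadableSection, and loadableSections are hypothetical names, and the host is assumed to be little-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a value of type T from a byte buffer (host assumed little-endian).
template <typename T>
T readAt(const std::vector<std::uint8_t> &image, std::size_t offset)
{
    T value{};
    std::memcpy(&value, image.data() + offset, sizeof(T));
    return value;
}

struct LoadableSection {
    std::uint64_t address; // sh_addr: where the section should live in memory
    std::uint64_t offset;  // sh_offset: where its bytes start in the file
    std::uint64_t size;    // sh_size: how many bytes to copy
};

// Walk the section header table of an ELF64 little-endian image and return
// the sections that have a nonzero address, file offset, and size.
std::vector<LoadableSection> loadableSections(const std::vector<std::uint8_t> &image)
{
    std::vector<LoadableSection> result;
    if (image.size() < 0x40) return result;            // too small for an ELF64 header
    static const std::uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};
    if (std::memcmp(image.data(), magic, 4) != 0) return result;
    if (image[4] != 2 || image[5] != 1) return result; // must be 64-bit, little-endian

    const auto shoff = readAt<std::uint64_t>(image, 0x28);     // e_shoff
    const auto shentsize = readAt<std::uint16_t>(image, 0x3A); // e_shentsize
    const auto shnum = readAt<std::uint16_t>(image, 0x3C);     // e_shnum

    for (std::uint16_t i = 0; i < shnum; ++i) {
        const std::uint64_t entry = shoff + std::uint64_t(i) * shentsize;
        if (entry + 0x28 > image.size()) break;        // truncated table
        LoadableSection s{readAt<std::uint64_t>(image, entry + 0x10),
                          readAt<std::uint64_t>(image, entry + 0x18),
                          readAt<std::uint64_t>(image, entry + 0x20)};
        if (s.address && s.offset && s.size) result.push_back(s);
    }
    return result;
}
```

Skipping entries where any of the three fields is zero drops the null section and anything that has no load address, which is the same filter the loader applies before copying.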

For now, I am using the floppy model: I pretend the final executable is a 1.44M floppy, and the bootloader, kernel loader, and kernel itself are just appended to each other on the disk image.  The loader I am about to provide also has the added benefit of loading the file one sector at a time, no matter how large, so that it can take a kernel of arbitrary size and load it.

It sets a generic page table allowing access to the first 16M of memory, loads 16 bit, 32 and 64 bit descriptors in the GDT, giving the final segment as 0x28 when it launches into 64 bit mode.  It also enables the SSE registers, since a lot of the code made by my C++ compiler of choice (clang) has SSE registers in heavy use.  Why not?  Clear 128 bits of memory or more in one instruction vs 64.  An able programmer should be able to adapt this to use on a hard drive and to make it a little more dynamic.  This file is intended to be loaded by the MBR and executed at 0x6000

%include "loaderconstants.inc"
[ORG 0x6000]
[BITS 16]

;; Store boot drive
MOV [bootDrive],DL

;; Read first sector of ELF image and get needed data from it
XOR EAX,EAX
MOV AX,kernelLBAAddr
CALL ReadSector

;; First lets make sure its actually an ELF
MOV EAX,0x464c457f
CMP DWORD EAX,[0x7000]
JNZ badELF
; Make sure its 64 bit little endian
MOV AX,0x0102
CMP WORD AX,[0x7004]
JNZ badELF

;; Enter unreal mode, must be done before using copyData function
; Lets assume we have a computer built after 1997
MOV AX,0x2401
INT 0x15
CLI
LGDT [GDTR]
MOV EAX,CR0
INC EAX
MOV CR0,EAX
MOV BX,0x20
MOV FS,BX
DEC EAX
MOV CR0,EAX
STI

;; Now we should be in 16 bit REAL mode with access to the first 4G of RAM through FS
MOV BX,[0x703C]
MOV EAX,[0x7018]
MOV [krnlEntry],EAX
;; GET SHT Address
;; We use the Section Header instead of the PHT, so that we can have an extra section in a separate location, ie an
;; x86 Real mode interrupt handler at 0x5000 while the kernel itself resides at 1MB
MOV DWORD ESI,[0x7028]
; Get sizeof(SHT)
MOV AX,[0x703A] ; Size of entry
MUL BX ; num entries


XOR ECX,ECX
MOV CX,AX
MOV EDI,0x8000

;; Copy SHT to 0x8000 using our data read functions
CALL copyData

;; Now that we have our sections in memory, lets go through them one by one and load them
XCHG BX,CX
MOV BX,0x8010
elfLoop:
PUSH CX
MOV EDI,[BX] ;; Destination address
ADD BX,8
MOV ESI,[BX] ;; Offset into file
ADD BX,8
MOV ECX,[BX] ;; Size

;; None of these can be zero
CMP EDI,DWORD 0
JZ .elSkipSection
CMP ESI,DWORD 0
JZ .elSkipSection
CMP ECX,DWORD 0
JZ .elSkipSection
CMP DWORD [stackStart],0
JNZ .itsLoaded
   MOV DWORD [stackStart],EDI
.itsLoaded:
CALL copyData
MOV DWORD [mallocStart],EDI
.elSkipSection:
ADD BX,0x30
POP CX
LOOP elfLoop


;; Create page tables, assume 2MB pages are okay
;; Identity map the first 16MB We can set the rest up inside the kernel, for now
;; we know that we have at LEAST 16M
MOV EDI,0x10000
MOV DWORD [FS:EDI],0x11003
ADD EDI,0x1000
MOV DWORD [FS:EDI],0x12003
ADD EDI,0x1000
MOV DWORD [FS:EDI],0x000083
ADD EDI,0x8
MOV DWORD [FS:EDI],0x200083
ADD EDI,0x8
MOV DWORD [FS:EDI],0x400083;
ADD EDI,0x8
MOV DWORD [FS:EDI],0x600083;
ADD EDI,0x8
MOV DWORD [FS:EDI],0x800083;
ADD EDI,0x8
MOV DWORD [FS:EDI],0xa00083;
ADD EDI,0x8
MOV DWORD [FS:EDI],0xc00083;
ADD EDI,0x8
MOV DWORD [FS:EDI],0xe00083;


;; Get int 15 memory map and store the pmode idt in preparation for bios calls from the kernel
SIDT [0x7000]

; use the INT 0x15, eax= 0xE820 BIOS function to get a memory map
; inputs: es:di -> destination buffer for 24 byte entries
; outputs: bp = entry count, trashes all registers except esi
MOV DI,0x7012
xor ebx, ebx            ; ebx must be 0 to start
xor bp, bp              ; keep an entry count in bp
mov edx, 0x0534D4150    ; Place "SMAP" into edx
mov eax, 0xe820
mov [es:di + 20], dword 1       ; force a valid ACPI 3.X entry
mov ecx, 24             ; ask for 24 bytes
int 0x15
jc short .failed        ; carry set on first call means "unsupported function"
mov edx, 0x0534D4150    ; Some BIOSes apparently trash this register?
cmp eax, edx            ; on success, eax must have been reset to "SMAP"
jne short .failed
test ebx, ebx           ; ebx = 0 implies list is only 1 entry long (worthless)
je short .failed
jmp short .jmpin
.e820lp:
mov eax, 0xe820         ; eax, ecx get trashed on every int 0x15 call
mov [es:di + 20], dword 1       ; force a valid ACPI 3.X entry
mov ecx, 24             ; ask for 24 bytes again
int 0x15
jc short .e820f         ; carry set means "end of list already reached"
mov edx, 0x0534D4150    ; repair potentially trashed register
.jmpin:
jcxz .skipent           ; skip any 0 length entries
cmp cl, 20              ; got a 24 byte ACPI 3.X response?
jbe short .notext
test byte [es:di + 20], 1       ; if so: is the "ignore this data" bit clear?
je short .skipent
.notext:
mov ecx, [es:di + 8]    ; get lower dword of memory region length
or ecx, [es:di + 12]    ; "or" it with upper dword to test for zero
jz .skipent             ; if length qword is 0, skip entry
inc bp                  ; got a good entry: ++count, move to next storage spot
add di, 24
.skipent:
test ebx, ebx           ; if ebx resets to 0, list is complete
jne short .e820lp
.e820f:
mov [0x7010], bp        ; store the entry count
clc                     ; there is "jc" on end of list to this point, so the carry must be cleared
JMP LetsGo
.failed:
stc                     ; "function unsupported" error exit
JMP LetsGo

LetsGo:
;; Let's jump from 16 bit to 32 to 64 then to the kernel
CLI ;; Goodbye interrupts until we are in C++ code
MOV EAX,CR0
INC EAX
MOV CR0,EAX
JMP 0x18:mode32
mode32:
[BITS 32]
MOV AX,0x20
MOV DS,AX
MOV DX,0x3F2 ;; Turn the floppy motor off, it's annoying!
MOV AL,0xC
OUT DX,AL
;; Set PAE and PGE bit
MOV EAX, 10100000b
MOV CR4,EAX
MOV EDI,0x10000
MOV CR3,EDI
MOV ECX, 0xC0000080               ; Read from the EFER MSR.
RDMSR
OR EAX, 0x00000500                ; Set the LME bit (bit 8). Bit 10 is LMA, which the CPU sets itself once paging is on.
WRMSR

MOV EBX,CR0                      ; Activate long mode -
OR EBX,0x80000001                 ; - by enabling paging and protection simultaneously.
MOV CR0,EBX

;; Now let's set up and activate all of that fancy math coprocessor support
;; SSE Instructions

MOV EAX,CR0
AND AX,0xfffb
OR AX,2
MOV CR0,EAX
MOV EAX,CR4
OR AX,3 << 9
MOV CR4,EAX

JMP 0x28: longMode
longMode:
[BITS 64]
MOV AX,0x30
MOV DS,AX
MOV ES,AX
MOV FS,AX
MOV GS,AX
MOV SS,AX
XOR RSP,RSP
MOV ESP,[stackStart]
MOV QWORD RAX,[krnlEntry]
XOR RDI,RDI
MOV EDI,[mallocStart]
MOV RBP,RSP
CALL RAX

CLI
HLT

[BITS 16]
RET

;; Functions

;; Copies ECX bytes, starting ESI bytes into the file, to address EDI
;; Dynamically loads sectors as needed
copyData:
PUSH EBX
PUSH ESI
PUSH EAX
PUSH EDX
PUSH ECX

;; First get starting sector
XOR EAX,EAX
XOR EDX,EDX
MOV EAX,ESI
MOV EBX,512
DIV EBX
ADD EAX,kernelLBAAddr
CALL ReadSector

;; Copy from first sector
MOV ECX,0x200
SUB ECX,EDX  ;; ecx has rest of sector count
POP EBX      ;; actual requested bytes in ebx
CMP EBX,ECX  ;; Is it less?  Can it all really fit in one sector?
JC .onlyOneNeeded ;; Yup
SUB EBX,ECX
PUSH EBX
JMP .doCopy
.onlyOneNeeded:
XCHG EBX,ECX
PUSH DWORD 0
.doCopy:
MOV ESI,EDX
ADD ESI,0x7000
CALL copyBytes

;; Ok, how much is left?
.cdSectorLoop:
POP ECX
CMP ECX,0
JZ .cdDone ;; No more data?
CMP ECX,0x200
JC .cdLastSector ;; Less than one sector of data left

;; Read a whole sector and transfer up to destination
SUB ECX,0x200
PUSH ECX
INC EAX
CALL ReadSector
MOV ECX,0x200
MOV ESI,0x7000
CALL copyBytes
JMP .cdSectorLoop

.cdLastSector:
INC EAX
CALL ReadSector
MOV ESI,0x7000
CALL copyBytes

.cdDone:
POP EDX
POP EAX
POP ESI
POP EBX
RET

;; Copies bytes from esi to edi
;; We have to do this this way since 16 bit rep movsb will only do 64k of ram, this can access the first 4G
copyBytes:
PUSH AX
.cbLoop:
MOV AL,[FS:ESI]
MOV [FS:EDI],AL
INC ESI
INC EDI
LOOP .cbLoop
POP AX
RET

;; Read a sector with the LBA address in EAX into 0x7000
ReadSector:
PUSHAD
MOV [currSector],EAX
CALL incrementSpinner
MOV DL,[bootDrive]
CMP DL,0x80
JNC .readHDD
; We don't need dword support for a floppy
CALL LBAtoCHS
MOV DL,[bootDrive]
MOV AX,0x201
MOV BX,0x7000
INT 0x13
JC readError
POPAD
RET
.readHDD:
MOV DWORD [HDDReadPacket.sector],EAX
MOV AX,0x4200
MOV SI,HDDReadPacket
INT 0x13
JC readError
POPAD
RET


;; Converts LBA to CHS address for a 1.44 floppy
LBAtoCHS:
;[in AX=LBA Sector]
;[out CH=Cylinder, CL=Sector (1 based), DH=Head; AX and DL are clobbered]
XOR CX,CX
XOR DX,DX
DIV WORD [flpSecTrk]
INC DX
MOV CL,DL
XOR DX,DX
DIV WORD [flpHds]
MOV DH,DL
MOV CH,AL
RET

;; Increments the spinner so that the user can see something is happening
incrementSpinner:
PUSH SI
PUSH CX
MOV SI,txtSpinner
XOR CX,CX
MOV CL,[txtSpPos]
INC CL
.incrementSpinner1:
ADD SI,3
LOOP .incrementSpinner1
MOV CL,[txtSpPos]
CALL printString
INC CL
CMP CL,4
JLE .incrementSpinnerOut
MOV CL,0
.incrementSpinnerOut:
MOV [txtSpPos],CL
POP CX
POP SI
RET

printString:
PUSH AX
PUSH BX
PUSH CX
MOV AH,0xe
XOR BX,BX
XOR CX,CX
.printStringLoop:
LODSB
TEST AL,AL
JZ .printStringExit
INT 0x10
JMP .printStringLoop
.printStringExit:
POP CX
POP BX
POP AX
RET

;; Error functions
readError:
MOV SI,readErrorStr
CALL printString
CLI
HLT
readErrorStr db 13,10,13,10,"Disk Read error",0

badELF:
MOV SI,badELFStr
CALL printString
CLI
HLT
badELFStr db 13,10,13,10,"Corrupted ELF Image!",0

;; Data
txtSpinner db 0,0,0,"/",8,0,"-",8,0,"\",8,0,"|",8,0,".",0
txtSpPos db 0
bootDrive db 0
currSector dd 0
flpSecTrk dw 18
flpHds dw 2
krnlEntry dq 0
mallocStart dd 0
stackStart dd 0
HDDReadPacket:
;; Some of these values are static
db 0x10
db 0
dw 1
dw 0x7000
dw 0
.sector dq 0

ALIGN 8
GDT:
dq 0
;; 16 Bit
    dd 0x0000ffff   ;; Code 0x8
    dd 0x00009c00
    dd 0x0000ffff   ;; Data 0x10
    dd 0x00009200
;; 32 Bit Segments
    dd 0x0000ffff   ;; Code 0x18
    dd 0x00cf9c00
    dd 0x0000ffff   ;; Data 0x20
    dd 0x00cf9200
;; 64 bit
    dq 0x002f98000000ffff ; Code 0x28
    dq 0x002f92000000ffff ; Data 0x30

GDTR:
dw (GDTR-GDT)-1
dd GDT

TIMES (512 * (loaderNumSects))-($-$$) DB 90
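
The magic numbers in elfLoop (0x8010, the two ADD BX,8 steps and the final ADD BX,0x30) come from the layout of a 64 bit ELF section header. Here is a rough C++ sketch of the same walk, in case it helps when reading the assembly. The names Elf64Shdr, LoadPlan and planSections are made up for this illustration and are not part of the loader; the sketch also keeps full 64 bit values, while the loader only reads the low 32 bits of each field.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Field layout of one ELF64 section header, as described in the ELF-64 format.
// (Illustrative struct; the loader itself just uses raw offsets.)
struct Elf64Shdr {
    std::uint32_t sh_name;
    std::uint32_t sh_type;
    std::uint64_t sh_flags;
    std::uint64_t sh_addr;      // offset 0x10: destination address (EDI in elfLoop)
    std::uint64_t sh_offset;    // offset 0x18: offset into the file (ESI)
    std::uint64_t sh_size;      // offset 0x20: size in bytes (ECX)
    std::uint32_t sh_link;
    std::uint32_t sh_info;
    std::uint64_t sh_addralign;
    std::uint64_t sh_entsize;
};

// These are the offsets the assembly relies on.
static_assert(offsetof(Elf64Shdr, sh_addr) == 0x10, "address field at 0x10");
static_assert(offsetof(Elf64Shdr, sh_offset) == 0x18, "file offset field at 0x18");
static_assert(offsetof(Elf64Shdr, sh_size) == 0x20, "size field at 0x20");
static_assert(sizeof(Elf64Shdr) == 0x40, "one entry is 0x40 bytes");

// Hypothetical summary of what the loop leaves behind in the loader's variables.
struct LoadPlan {
    std::uint64_t stackStart = 0;   // address of the first section that gets loaded
    std::uint64_t mallocStart = 0;  // first byte after the last section that gets loaded
    unsigned loaded = 0;            // how many sections would be copied
};

// Mirrors the decisions made in elfLoop without touching the disk or memory.
LoadPlan planSections(const std::vector<Elf64Shdr>& sections) {
    LoadPlan plan;
    for (const Elf64Shdr& s : sections) {
        // Same rule as the assembly: address, offset and size must all be nonzero.
        if (s.sh_addr == 0 || s.sh_offset == 0 || s.sh_size == 0) {
            continue;
        }
        if (plan.stackStart == 0) {
            plan.stackStart = s.sh_addr;
        }
        // copyData leaves EDI just past the copied bytes, which becomes mallocStart.
        plan.mallocStart = s.sh_addr + s.sh_size;
        ++plan.loaded;
    }
    return plan;
}
```

Feeding it the section headers of a kernel image shows which sections the loader would copy, where the stack would start, and the address passed to the kernel in RDI, all without booting anything.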