Nerding Out

For the last few weeks, I have been playing a lot of Star Wars Galaxies. Yes, the game that was murdered by Sony Online Entertainment in 2011. It was one of the greatest games to have ever been created and you can’t tell me otherwise.

So how am I playing it? On an emulated server, of course.

The Problem

There’s a lot of looting. A lot. As a programmer, I’m obligated to attempt to automate any task that I have to perform more than twice per day.

In SWG, there’s a lot of looking up loot tables to see if the random statistics on a piece of equipment make it worth keeping. I don’t have the space in my head to remember every single acceptable range for dozens of statistics, and looking things up manually in a spreadsheet offends me. I’m not a small business owner; I know a spreadsheet isn’t a database.

Naive Approach

The community has created these lovely big loot tables of what’s good and what’s not, and I intend to rip off, sorry, incorporate their hard work for my own convenience. Now I’m thinking like a big business owner.

The dream state would be to have the text of an item change colour based on its statistics. This would require me to write an addon for SWG. However, SWG doesn’t have addons, from what I understand.

I would have to write something that I would inject into the client-side code, probably getting ejected from the game by the server in the process.

OCR?

I intend to attempt to write a simple Node.js program that will take a screenshot of my screen and perform OCR on an item.

If the OCR can read the item’s statistics, I’ll just do some regex, perform a lookup and dump a message to the console.

OCR!

What is OCR? Optical Character Recognition. Humans wrote a lot of books before there were any decent word processors. Obviously we need to digitise all of that knowledge and, well, let’s just say that data entry is a fairly mind-numbing job.

During my OCR research I encountered the Tesseract project, an open source OCR engine whose development has long been sponsored by Google.

The Program

Alright, let’s write the program.

First, I had to install tesseract on my machine. Easy enough, there’s a Windows installer.

As we’re writing a node program, we’d better find a library to give us access to tesseract. I chose node-tesseract as the library for the job. (I’ve since found a different library that apparently works with tesseract v4 - we’ll use that in a future update).

Step 1. Capture a Screenshot

I’m using desktop-screenshot to capture a desktop screenshot.

Let’s take a screenshot:

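    var screenshot = require('desktop-screenshot');

    // Capture the entire desktop and write it to image.png
    screenshot('image.png', function(error, complete){
        if(error){
            // uh oh
        } else {
            processImage();
        }
    });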

Nice and straightforward. However, this does capture the ENTIRE desktop. We’re only interested in the small part of it that shows our values.

Step 2. Crop the Screenshot

Here comes sharp to the rescue. sharp is an image processing library for Node with a friendly API.

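    var sharp = require('sharp');

    sharp(imageName)
        .extract({ width, height, top, left })
        // assumption: write the crop to disk so tesseract can read it by path
        .toFile('cropped.png')
        .then(function(info){
            // hand cropped image to tesseract
        });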

We’ve given sharp a width and height, as well as an offset from the top and the left, which together denote the area of the screenshot that we care about. I’m not too fussed about hard-coding these values, as you can move the UI around in the game. In a future version, maybe I will use a little draggable window to define the capture area.
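sharp rejects an extract area that falls outside the image, so a hard-coded capture area can break when the screen resolution changes. Here is a minimal sketch of a guard for that. The function name clampRegion and the shape of the screen argument are my own illustrative assumptions, not part of sharp.

```javascript
// Hypothetical helper: keeps a capture region inside the screenshot bounds.
// 'region' is { left, top, width, height }; 'screen' is { width, height }.
function clampRegion(region, screen) {
    // Pull the top-left corner back inside the image if it starts outside.
    var left = Math.max(0, Math.min(region.left, screen.width - 1));
    var top = Math.max(0, Math.min(region.top, screen.height - 1));
    // Shrink the box so it does not run past the right or bottom edge.
    var width = Math.max(1, Math.min(region.width, screen.width - left));
    var height = Math.max(1, Math.min(region.height, screen.height - top));
    return { left: left, top: top, width: width, height: height };
}

// A 400x300 region near the bottom-right corner of a 1920x1080 desktop.
var safeRegion = clampRegion(
    { left: 1800, top: 900, width: 400, height: 300 },
    { width: 1920, height: 1080 }
);
console.log(safeRegion);
```

The returned object has the same keys that extract() expects, so it could be passed straight through.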

Step 3. Convert the Image to Text

Time to finally call tesseract!

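    var tesseract = require('node-tesseract');

    // Run OCR on the image and receive the recognised text in the callback
    tesseract.process('path/to/file', function(error, text){
        if(error){
            // uh oh
        } else {
            writeOCRTextToFile(text);
        }
    });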

I think it’s time to just sit back and take a moment to reflect upon how good we have it as programmers. If you find yourself taking a library or open source project for granted, I suggest you re-examine your attitude. I have just used someone else’s code, for free, to convert an image into text. Instant access to literal years’ worth of work.

Output:

level I shlp Equummmmm...

m

m"... I

shw Eomlmnun
Armor msxms
wwm msxms
M255 mm

Rgzdnvfienenhonkzk. 21119.9

Rum”: snmmnq um. I

m; .5 . Mm. m; (omvonenQvoweri .
of your on.“ 5h... (omvonenfi. u your
mum! m5 znv (omlmnun um wmm
ummu shumw... to en...» 5m

Well, that’s not too useful. Perhaps I spoke too soon? Let’s do a little image processing to help tesseract figure out what it’s looking at.

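    sharp(imageName)
        .extract({ width, height, top, left })
        .grayscale()
        .resize({ width: 1000 })
        // assumption: write the result to disk so tesseract can read it by path
        .toFile('cropped.png')
        .then(function(info){
            // hand processed image to tesseract
        });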

We’re now converting the cropped image to grayscale and resizing it to be 1000px wide (given no height, sharp maintains the aspect ratio).

Here’s the cropped screenshot after the additional processing: (image: Screenshot)

Let’s see if that’s improved anything.

Level 2 Ship Equipment Certification:

Yes
Volume: 1
Ship Component
Armor: 113.3/‘1133
Hitpoints: 1133/1133
Reactor Energy Drain: 998.?
Mass: 1654.8

Capacitor Energy: 648.3/‘6483
Recharge Rate: 25.3

Reverse Engineering Level: 2

This is a weapon capacitor. Every shot tila'
weapon fires will drain energy from the

Amazing. Just one quick tweak and things are becoming downright useful. It’s not perfect (tesseract still trips over a few characters, like the ‘998.?’ above), but we’ll use it for now.

Step 4. Regex Time

Everybody’s favourite - Regex!

Okay, pretty straightforward - we want some numbers that occur after specific words and we want to save them into values.

E.g. var redRegex = /Drain:\s(\d+\.\d)/gm; We want the decimal number that follows the word ‘Drain’. Note the escaped dot: an unescaped . would match any character, not just a decimal point. We’ll then see if anything matches our redRegex pattern and assign it to a variable so we can compare it to the value in our lookup table.
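Here is a minimal sketch of what that extraction could look like for all of the statistics at once, run against the improved OCR output from Step 3. Only the ‘Drain’ pattern is mine from the program. The other patterns, the patterns object and the parseStats function are illustrative assumptions. The decimal part is optional here so that a misread like ‘998.?’ still yields 998.

```javascript
// Hypothetical patterns, one per statistic, modelled on the OCR output.
// The fractional part is optional so OCR noise after the dot is tolerated.
var patterns = {
    level: /Reverse Engineering Level:\s*(\d+)/,
    red: /Drain:\s*(\d+(?:\.\d+)?)/,
    mass: /Mass:\s*(\d+(?:\.\d+)?)/,
    energy: /Capacitor Energy:\s*(\d+(?:\.\d+)?)/,
    recharge: /Recharge Rate:\s*(\d+(?:\.\d+)?)/
};

// Returns an object with one number per statistic, or null where the
// pattern did not match anything in the OCR text.
function parseStats(text) {
    var stats = {};
    Object.keys(patterns).forEach(function (name) {
        var match = patterns[name].exec(text);
        stats[name] = match ? parseFloat(match[1]) : null;
    });
    return stats;
}

var ocrText = [
    'Reactor Energy Drain: 998.?',
    'Mass: 1654.8',
    'Capacitor Energy: 648.3/‘6483',
    'Recharge Rate: 25.3',
    'Reverse Engineering Level: 2'
].join('\n');

console.log(parseStats(ocrText));
```

These patterns deliberately leave off the g flag. A global regex remembers where its last exec() call stopped, which causes surprising misses when the same pattern is reused on a new screenshot.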

Here is the lookup table for a Capacitor (it is responsible for how many times my spaceship’s laser can go pew pew):

    var capacitorTable = {
        2: { red: 700, mass: 1300, energy: 950, recharge: 38 },
        4: { red: 800, mass: 3000, energy: 1000, recharge: 40 },
        6: { red: 800, mass: 7800, energy: 1300, recharge: 47 },
        10: { red: 800, mass: 38000, energy: 1600, recharge: 58 },
    };

  • object key - What ‘Reverse Engineering Level’ the component is.
  • Red - Reactor Energy Drain. (smaller is better)
  • Mass - the mass of the component (smaller is better)
  • Energy - the maximum charge the capacitor can hold (bigger is better)
  • Recharge - how quickly the capacitor recharges (bigger is better)

Step 5. The Numbers, What Do They Mean?

The components in this game are all about balancing these values. Your spaceship can only support so much mass, so for a given limit (e.g. 80,000) you want the best components you can get whose masses add up to less than that. Your ship also has a reactor, which can only generate so much energy. Components with lower drain and mass that still have good damage or speed are the ones we are interested in.

So now that we’ve pulled the values out and we have all the data in our lookup tables, it’s just a few straightforward if statements to see if we’ve got a component worth keeping.
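Roughly, those comparisons could be sketched like this. I am assuming here that each table value is a pass mark: drain and mass pass when they are at or below it, energy and recharge pass when they are at or above it. The evaluate function and the smallerIsBetter map are illustrative names, and the table is trimmed to its first two rows to keep the example short.

```javascript
// First two rows of the capacitor table, keyed by Reverse Engineering Level.
var capacitorTable = {
    2: { red: 700, mass: 1300, energy: 950, recharge: 38 },
    4: { red: 800, mass: 3000, energy: 1000, recharge: 40 }
};

// For these stats a smaller value is better; for the rest, bigger is better.
var smallerIsBetter = { red: true, mass: true };

// Returns { statName: true|false }, where true means the stat passes.
// Returns null when the table has no row for the component's level.
function evaluate(stats, table) {
    var thresholds = table[stats.level];
    if (!thresholds) {
        return null;
    }
    var verdicts = {};
    Object.keys(thresholds).forEach(function (name) {
        verdicts[name] = smallerIsBetter[name]
            ? stats[name] <= thresholds[name]
            : stats[name] >= thresholds[name];
    });
    return verdicts;
}

// The level 2 capacitor from the OCR output: every stat misses its mark.
var looted = { level: 2, red: 998, mass: 1654.8, energy: 648.3, recharge: 25.3 };
console.log(evaluate(looted, capacitorTable));
```

Returning one verdict per statistic, rather than a single keep-or-junk answer, leaves room to colour each value separately.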

My program uses chalk to log the values out to the console in different colours if they are above or below the thresholds defined in our lookup tables.
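A minimal sketch of that output step follows. It is not my exact code: formatVerdicts, the verdicts object (true where a statistic passes its threshold) and the paint argument are illustrative assumptions. The colouring functions are passed in, so the example runs without chalk installed.

```javascript
// Builds one console line per statistic. 'paint' supplies the colouring
// functions so the formatting logic does not depend on any one library.
function formatVerdicts(stats, verdicts, paint) {
    return Object.keys(verdicts).map(function (name) {
        var colour = verdicts[name] ? paint.good : paint.bad;
        return colour(name + ': ' + stats[name]);
    });
}

// Plain-text stand-ins for trying the function without chalk.
var plainPaint = {
    good: function (s) { return '[keep] ' + s; },
    bad: function (s) { return '[junk] ' + s; }
};

var lines = formatVerdicts(
    { red: 750, mass: 3500 },
    { red: true, mass: false },
    plainPaint
);
lines.forEach(function (line) { console.log(line); });
```

With chalk installed, passing { good: chalk.green, bad: chalk.red } as the paint argument should give the coloured version.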

E.g. (image: Console Output)

THE END

We managed to build a needlessly complex program that achieves something I could do in my head with just a little more practice. But then, any programmer worth their salt should always be looking for ways to ‘simplify’ their life.