Category Archives: refactoring

The missing sources and data directory

As I said in previous posts, before delving more into the code base I decided to do a simple test: try to compile and link the first program run by the batch file, just to see what to expect.

Porting the .lnk file was quite easy. The original file was

BLINKER INCREMENTAL OFF

FI EXE01,FUNCKY,RESIDENT

SEA FUNBLINK

BEGINAREA

  FILE file01
  FILE file02
...
  FILE file40

  ALLOCATE OVERCL
  ALLOCATE APPLIB
  ALLOCATE FUNCKY
  ALLOCATE EXTEND
ENDAREA

LIB CLIPPER

The converted file is (with the first 3 lines not strictly necessary)

-gui
-gtwvt
-inc

-oEXE01

src/file01
src/file02
...
src/file40

I was sure I was going to hit a wall… but I was deliberately looking for a failure!

The results were (from the log I saved, translated here from Italian):
hbmk2: Hint: Add option ‘hbxpp.hbc’ for missing function(s): CurDrive()
hbmk2: Hint: Add option ‘hbct.hbc’ for missing function(s): Center()
hbmk2: Hint: Add option ‘hbblink.hbc’ for missing function(s): BliOvlClr()

And then another 30 missing functions.

Some that I found in Funcky docs: CLS(), ROLOC(), PRINT(), BORDER(), UNCLOCK24(), ONDO(), FLRESET(), CLOCK24(), ALEN(), CHDIR(), AMAXSTRLEN().

And some (I won’t name them) that were in the missing application lib I already talked about in a previous post. Among the 700+ files present in the directory there were a couple generated by the linker that list the object files and where they come from, so I could confirm I had no sources for them.
A couple of the function names referred to the screen: setting attributes, showing messages on screen, setting fonts (we are talking about EGA/VGA times…).
Others were file-related functions, used to open database files. Others were utility functions, and a couple were clones of the padr/padl/padc functions that were not present in Clipper 87.

Using the Funcky docs found online I was able to quickly rewrite the most important ones, with very basic functionality, discarding the ones like the clock (on-screen clock was good to have on text-mode programs, not in GUI programs…).

Looking at the code I was able to quickly create stub functions for the others: it has been a very interesting process, trying to work out from each function call what the function was for, which parameters it took and what return value was expected.
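To give an idea of what such stubs can look like, here is a minimal sketch. It is not the code I actually wrote: the argument lists are not taken from the Funcky documentation, MissingFunc is a made-up name standing in for one of the functions I can't name, and its .T. return value is only a guess at what a caller might expect.

```harbour
// Hypothetical stand-ins for functions the linker could not find.
// The "..." parameter list accepts any number of arguments, so the
// existing calls compile and link whatever they pass.

// On-screen clock: not wanted in a GUI build, so do nothing.
FUNCTION Clock24( ... )
   RETURN NIL

FUNCTION UnClock24( ... )
   RETURN NIL

// Made-up name for a function from the missing application library.
// It logs every call to stderr so that I can see which stubs are
// really reached while exercising the program.
FUNCTION MissingFunc( ... )
   OutErr( "stub called: MissingFunc(), " + hb_ntos( PCount() ) + " argument(s)" + hb_eol() )
   RETURN .T.   // guessed return value
```

Logging from the stub is a cheap way to find out which of the missing functions matter first.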

During this process I had to look at several source code files and noticed calls to functions that were expected to be in the missing library and that I had still not rewritten! What’s more, the only missing functions reported were from file01.prg… was it possible that the other 39 files used no missing functions?

This surprised me a lot. So I swapped the order of the files in the .hbp file and put file03.prg first. Now the build process reported different missing functions, but always related to file03. I could not explain exactly why this was happening and had several hypotheses, none fully verified. For one of them I thought I had an answer: browsing the source code I saw that the Funcky ondo() function was used.

Ondo has this syntax:

ONDO(n,'file03()','file04()','file05()','file06()','file07()')

Depending on the value of the n variable, it calls the function listed in parameter n+1, using the macro operator. This means the listed functions must be linked into the executable for the call to work at runtime. My idea was that Blinker was linking them while the MinGW linker wasn’t… and some other strange ideas…
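A minimal sketch of how such a function could be rewritten in Harbour, assuming the behaviour is exactly what I described above (Funcky's real ONDO() may well differ). File03() and File04() are placeholders defined in the same file just to make the example complete; in the real application they live in other modules, which is where the REQUEST line matters.

```harbour
// The names are only passed as strings, so no compiled code refers
// to the functions directly: ask the linker to keep them anyway.
REQUEST File03, File04

PROCEDURE Main()
   // n = 2 selects the second name in the list, i.e. "File04()"
   OnDo( 2, "File03()", "File04()" )
   RETURN

// Hypothetical re-implementation, based only on the description above
FUNCTION OnDo( n, ... )
   LOCAL aParams := hb_AParams()   // element 1 is n itself

   IF ValType( n ) == "N" .AND. n >= 1 .AND. n + 1 <= Len( aParams )
      RETURN &( aParams[ n + 1 ] )   // macro-evaluate the chosen call
   ENDIF
   RETURN NIL

FUNCTION File03()
   ? "in File03"
   RETURN NIL

FUNCTION File04()
   ? "in File04"
   RETURN NIL
```

The functions called through the macro operator must not be STATIC, otherwise they cannot be found by name at runtime.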

But this was not the problem. It’s really a lot simpler: if there is only one missing function in the object file currently being linked, the linker continues and reports errors from other object files. As soon as it reports more than one error in an object file, the errors from the other object files are not reported.

It’s impossible (unless there is a linker switch that I’m not aware of) to get a full report of all the missing functions in all the files.

Doing the tests and looking at the Blinker-generated reports I found, I could list more than 20 functions for which I had no source code. I’m not sure the software house has these sources or can provide them to me. So I decided to do something I didn’t want to do: use a decompiler. I don’t own one but a trusted friend of mine does, so I sent him one of the executables and got back well-formatted, easy-to-read source files for the application. Using the Blinker-generated file I could recreate the source files with the same structure as the originals. For a moment I also wanted to use the decompiler to generate more readable code for ALL the source code, but I fear it may introduce problems, and I would lose all the comments, commented-out code, and so on.

Among the decompiled functions there are the routines used to open data tables, create indexes, create temporary tables and look up values. Some of them could apparently be easily rewritten, but others have features or side effects that are not easy to understand. The original (well… decompiled) code is better.

Ok, so finally the program ran and the main menu appeared on the screen, but any option I selected would stop the program. I made sure the modules were correctly linked, and now the program stopped only after a few more program lines.

In the second post I talked about the structure of the data files: one root directory pointed to by an environment variable, one subdirectory for each client/year, and data files that can be in both the root and the subdirectory (the latter has precedence). When you choose an action and no workspace is selected, the main menu lists all the available subdirectories and then uses chdir() to set the working directory. Then, depending on the selection, it executes the menu action or sets the errorlevel and exits, letting the calling batch file run the proper executable.

The program could not locate the data files… so the time had come to start looking at the code.

The code of this part is not very complex but it gave me an idea of the difficulties I may need to solve in the future. Let’s have a look at the first few lines of this function, keeping in mind this is Clipper 87 code!

drive=CURDRIVE()+'\'
dir=SUBSTR(CURDIR(),4)

dir=SUBSTR(gete('ROOTDIR'),4)

n=AT('\',dir)
IF n>0
  drive=drive+LEFT(dir,n)
  dir=SUBSTR(dir,n+1)
ENDIF

ADIR(drive+'*.*',a6,.T.,.T.,.T.,a7)
nDir=ALEN(a7)

The first thing to note is that there is some really strange code: the drive and dir variables are set, then dir is overwritten, but not drive. I know that the ROOTDIR environment variable points to the root directory of the application, and from the code I see that it is effectively mandatory: without that env variable aDir() reads the root directory of the current drive, not the root directory of the application. That can be a valid setup, but it is really not a good idea to have all the subdirectories in the drive's root directory… unless the drive is dedicated to the data.

My idea is to change this code to make the env variable mandatory and exit the program if it is not set. This way you can have any data layout you want. Now the code is just 2 lines (the check for ROOTDIR is done at startup):

drive := gete( "ROOTDIR" ) 
ADIR(drive+'*.*',a6,.T.,.T.,.T.,a7)

I don’t know yet if this change is good or not since it needs to be validated with other issues I will introduce now.
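The startup check and the directory scan could look more or less like this. It is only a sketch under my own assumptions: it uses Harbour's hb_* functions and Directory() instead of the Funcky/Clipper 87 calls, the error message and the errorlevel value 1 are made up, and hb_DirSepAdd() is there because the two-line version above only works if ROOTDIR already ends with a path separator.

```harbour
#include "directry.ch"

PROCEDURE Main()
   LOCAL cRoot := hb_GetEnv( "ROOTDIR", "" )
   LOCAL aEntry

   // Mandatory environment variable: refuse to start without it
   IF Empty( cRoot ) .OR. ! hb_DirExists( cRoot )
      OutErr( "ROOTDIR is not set or is not a directory" + hb_eol() )
      ErrorLevel( 1 )   // made-up value
      QUIT
   ENDIF

   // Make sure the path ends with a separator before appending a mask
   cRoot := hb_DirSepAdd( cRoot )

   // List the client/year subdirectories, skipping "." and ".."
   FOR EACH aEntry IN Directory( cRoot + hb_osFileMask(), "D" )
      IF "D" $ aEntry[ F_ATTR ] .AND. ;
         !( aEntry[ F_NAME ] == "." .OR. aEntry[ F_NAME ] == ".." )
         ? aEntry[ F_NAME ]
      ENDIF
   NEXT
   RETURN
```

hb_osFileMask() and hb_DirSepAdd() keep the code independent of the platform's file mask and path separator.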

The code after this snippet loops on the a6/a7 arrays and when it finds a directory it does:

 IF CHDIR( drive + a6[nIndex] )
   IF FILE( 'file1.DBF') .AND. FILE( 'file2.DBF')
      my_netuse( 'file1' )
      company = FILE1->companyname
      CLOSE ALL
      // not shown: add the data to the array to be used
      // later for aChoice
   ENDIF
 ENDIF

From this code I learn that file1 and file2 must reside in the subdirectory and that chdir() is used to enter it. But from my previous browsing of the code I remembered having seen something of interest…

In fact, in one of the initialization functions, this code is executed:

_temp=GETE('ROOTDIR')
_path='.;'+_temp
SET PATH TO &_path

This code tells Clipper to try to open the data files in the current directory and, if it doesn’t find them there, to try the data root directory.
According to the docs, the file() function respects this setting and checks for the file in both directories…
The my_netuse function opens the data file and can’t handle paths; in fact it uses its parameter to name the alias:

 use &par alias &par

I hope you start to see the problem I’m facing: the code expects to be able to set a working directory AND to keep this setting between executables (something I was not able to achieve using hb_cwd calls), and it relies completely on Clipper to work out where the needed file actually is.

So, related to this particular aspect of the application, I’m starting to think about the following points:
– include all the code into one executable (instead of the 15/16 currently necessary)
– have a public variable to hold the current working directory, just for reference, if necessary
– change the program to use the (properly validated) directory in the env variable as the main data directory, with one subdirectory for each client/year; hb_cwd should work in one executable, so that no other changes in file handling code would be necessary.
– use Harbour's 2015-era directory functions, which must also work on Linux

Since we have just one executable we won’t need to keep the settings between executables. I just hope that somewhere in the code there isn’t something stopping this idea…
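A rough sketch of how the workspace selection could work inside a single executable, assuming hb_cwd() behaves as I expect within one process. SelectWorkspace() and WorkDir() are names I made up, and I used a file-wide STATIC with an accessor function instead of a PUBLIC variable, so that nothing else can overwrite the value by accident.

```harbour
#include "set.ch"

STATIC s_cWorkDir := ""   // current client/year directory, for reference

// cRoot is the validated ROOTDIR value, cSubDir the chosen client/year
FUNCTION SelectWorkspace( cRoot, cSubDir )
   LOCAL cDir

   cRoot := hb_DirSepAdd( cRoot )
   cDir  := hb_DirSepAdd( cRoot + cSubDir )

   IF ! hb_DirExists( cDir )
      RETURN .F.
   ENDIF

   hb_cwd( cDir )   // change the working directory of this process

   // Same idea as the original SET PATH: current directory first,
   // then the root directory for the common files
   Set( _SET_PATH, "." + hb_osPathListSeparator() + cRoot )

   s_cWorkDir := cDir
   RETURN .T.

FUNCTION WorkDir()
   RETURN s_cWorkDir
```

With this in place the existing code that opens files by bare name should keep working, but that still has to be verified against the rest of the application.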

Quest for formatted code

There was a post already scheduled to go online during the weekend that describes the first tests done with the code: just to check whether the code was complete, which libraries it used, whether it compiled, etc., I made some quick and dirty changes to the code. The changes were not documented, and were made one after the other with the goal of getting the code to compile and link. While documenting the steps I probably misplaced something, so I want to redo everything in order, with proper documentation.

In the meantime I also started to document the progress in this blog, so I took a decision: start from a clean situation and document all the changes. The first step is to reformat the source code to a common standard. At the moment there is code I received, code that was decompiled and code I wrote (the latter two will be described in the next posts).

As I wrote, I received a compressed archive, decompressed it and moved the files into different directories, and then all of them (and I mean ALL of them, including useless Clipper .obj files) were added to a Mercurial repository. I also added the decompiled code, and now it is time to clean up the code.

The tool of choice is hbformat, of course. I already know that committing the hbformatted code will create a monster diff but I think we, as programmers, should work with source code that is easy to work with, pleasant to read…

And my idea was that a quick hbformat *.prg would be the solution… I had to think again…

hbformat *.prg

The result was depressing… There are at least 3 problems with hbformat running on this code.

The first problem is that the code uses 4-character or otherwise abbreviated commands: among them, ENDC is used instead of ENDCASE (or END). ENDC is not recognized by hbformat. Changing it manually to ENDCASE works.

The second problem is that it doesn’t reformat the code (all lines start at column 0) until it finds one procedure or function statement. And almost none of my 200 files has a procedure or function statement as first line of code, using implicit declaration.

The third one is that hbformat uses stdout (OutStd()) for progress reports and stderr (OutErr()) for errors, and the error message doesn’t include the filename. Since you can’t redirect stdout and stderr to the same file, you end up with one file holding the progress log and another with just the error type and line number… useless.
So I went to the hbformat source code and added code to print the filename when an error occurs:

OutErr( cFilename + ": error", oRef:nErr, "on line", oRef:nLineErr, ":", oRef:cLineErr, hb_eol() )

Now I know where the errors are! Problem 3 solved.

Now there are 87 files to check manually for errors, one by one… I don’t trust global search and replace…

After checking about 10 files I noticed they all had the ENDC command that was misleading hbformat's logic, so I had a look at the formatter routines in the hbformat source code to try to add support for ENDC, but I stopped after 20 minutes… better to change them by hand…

So I reverted to the original files and, from the error log and with a bit of vim magic, created a batch file to load all 87 files and started to change them manually. This time I noticed that the programmer also used OTHERW in a case statement. A quick check confirmed that OTHERW was not recognized and not properly indented by hbformat.

Ok, after 20 minutes all ENDC and OTHERW were converted to the longer forms.
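For the record, a small helper like the following could at least list the lines to fix without touching the files (I still prefer to make the changes by hand). It is a sketch, not something I used: the program name findabbr is made up, and it only knows the two abbreviations I met, ENDC and OTHERW, when they are the first word on a line.

```harbour
// Usage (hypothetical name): findabbr file.prg
PROCEDURE Main( cFile )
   LOCAL cText, cLine, cWord
   LOCAL nLine := 0

   IF Empty( cFile )
      OutErr( "usage: findabbr <file.prg>" + hb_eol() )
      RETURN
   ENDIF

   // Normalise CR+LF to LF, then split the text into lines
   cText := StrTran( hb_MemoRead( cFile ), Chr( 13 ) + Chr( 10 ), Chr( 10 ) )

   FOR EACH cLine IN hb_ATokens( cText, Chr( 10 ) )
      nLine++
      cWord := FirstWord( cLine )
      IF cWord == "ENDC" .OR. cWord == "OTHERW"
         OutStd( cFile + ":" + hb_ntos( nLine ) + ": " + AllTrim( cLine ) + hb_eol() )
      ENDIF
   NEXT
   RETURN

// First blank-delimited word of a line, upper-cased; tabs count as blanks
STATIC FUNCTION FirstWord( cLine )
   LOCAL n

   cLine := LTrim( StrTran( cLine, Chr( 9 ), " " ) )
   n := At( " ", cLine )
   RETURN Upper( iif( n > 0, Left( cLine, n - 1 ), cLine ) )
```

The output has the filename and line number on every line, which is exactly what was missing from the hbformat error log.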

Now let’s address problem 2: add a procedure/function line to each source file so that hbformat can indent the code. As usual I will do it manually, so that I can do a little check of each file. In the process I discovered several duplicated files, some tests, some empty files, some really strange ones. A cleanup is needed. It took almost an hour, but now the code is ready to be formatted:

hbformat *.prg

Finally, no errors reported. But is the code properly formatted? Unfortunately it isn’t.

In the middle of the code there are these lines

EXTE HELPER, CALCOL
SET KEY K_F1 TO HELPER
SET KEY K_ALT_F1 TO CALCOL

EXTE is the abbreviation of EXTERNAL, used to tell the linker to include those functions. I checked in Harbour, and the preprocessor correctly converts the SET KEY commands to code blocks, so EXTERNAL isn’t needed there. I can’t be sure for Clipper 87. Anyway, in the hbformat code, when EXTE/EXTERNAL is found the internal state machine is reset and the following lines are placed at column 0, as if we were outside a procedure. This is another problem of hbformat, since these commands can be anywhere in the code.
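To show why the EXTERNAL line is redundant in Harbour, this is roughly what the two SET KEY lines become after preprocessing. The code block parameter names are mine, and the bodies of Helper and Calcol are placeholders, not the application's code.

```harbour
#include "inkey.ch"

PROCEDURE Main()
   // The code blocks call Helper() and Calcol() directly, so the
   // compiler emits a normal reference to both functions and the
   // linker pulls them in without any EXTERNAL/REQUEST.
   SetKey( K_F1,     {| cProc, nLine, cVar | Helper( cProc, nLine, cVar ) } )
   SetKey( K_ALT_F1, {| cProc, nLine, cVar | Calcol( cProc, nLine, cVar ) } )
   RETURN

// Placeholder bodies
PROCEDURE Helper( cProc, nLine, cVar )
   ? "help requested from", cProc, nLine, cVar
   RETURN

PROCEDURE Calcol( cProc, nLine, cVar )
   ? "calculator requested from", cProc, nLine, cVar
   RETURN
```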

I moved the EXTE directives at the top of the 16 source files in which they were present. Then a new formatting round:

hbformat *.prg

I was expecting hg stat to report just the 16 modified files. Instead it reported more, for changes of this kind:
[image: diff01]

I browsed 30 files and found no visible errors in the formatting so, finally, it’s time to commit. The diff is a mess as expected but in the future I may strip the previous commits and have this one as the first.

Just getting the source code into a form that is easy, or comfortable, to work with took several hours and needed a change in a tool to know exactly where the errors were. More changes are needed, since the tool was not able to cope with completely valid Clipper/Harbour code: that is why the time-consuming manual job was needed.

A first look at the code

After uncompressing the rar file, I created several directories and moved the files according to their types: all source code files went to the src directory, all the executable files to the exe directory. Then lib, obj and other common extensions.

The software is a multi-company multi-year accounting package, the type used by accounting professionals. Each company/year has a dedicated directory, under a root directory. Data files can be in the root directory if they are valid for every company (for example fiscal deadlines, vat codes) or in the subdirectory (invoices, clients, payments…). If the data file is present in both directories, the subdirectory has precedence.

The code has a mix of spaces and tabs (but not always); it uses # instead of !=, SELE instead of SELECT, RETU instead of RETURN (and so on). Comment blocks are delimited by a strange combination of /* that sometimes makes my editor's syntax highlighting choke. Single lines of commented-out code are introduced by *– and again the editor doesn’t recognize them as comment lines.

Printing is supported natively only on dot-matrix printers and in the source you find something like

SET DEVICE TO PRINTER
@ y,x SAY "text"

My hope was that they were using some sort of library for printing, so that I could replace it if it was not available for Harbour. Since I don’t want to touch code that makes calculations (and business and presentation code are mixed in the source code), I will accept the idea of printing to a temporary file and then using an external program to convert it to PDF, as suggested on the harbour-users mailing list.
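The print-to-file idea can be sketched in a few lines, leaving the existing @ ... SAY code untouched. The file name report.txt is made up, and the conversion step is only a comment because I have not chosen the external program yet.

```harbour
PROCEDURE Main()
   LOCAL cFile := "report.txt"   // hypothetical temporary file name

   SET PRINTER TO ( cFile )      // send printer output to a file
   SET DEVICE TO PRINTER         // @ ... SAY now goes to the "printer"

   @ 1, 5 SAY "text"             // existing report code, unchanged

   SET DEVICE TO SCREEN
   SET PRINTER TO                // close the file

   // Here an external converter would turn cFile into a PDF,
   // for example through hb_run() with the chosen command line.
   RETURN
```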

In the directory I found one batch file that compiles all the source files, one by one, using a long series of

CLIPPER file.prg -l

commands. This batch file compiles 178 files. But in the src directory I now have 201 prg files. And some of the 178 files are not present among the 201! More on this later.

There are 17 .lnk files. They are in Blinker syntax and can easily be converted to hbmk2. From these files I understand that the software uses the Funcky library. It also uses an internal library. I know the company had several business software products and they probably share a common library… unfortunately the source code for this library is not present.

The software is composed of various executables, probably not 17, but I think at least 10. They call each other through a batch file: the program that wants to call another one sets the errorlevel and exits. The batch file then calls the appropriate executable depending on the errorlevel value. The first executable in the batch file shows the menu and also contains a lot of forms, but not the main ones.
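On the program side the mechanism is tiny. A minimal sketch, where the value 3 is a made-up code and not one used by the real batch file:

```harbour
PROCEDURE Main()
   LOCAL nNext := 3   // hypothetical code meaning "run executable number 3"

   // ... the menu would set nNext according to the user's choice ...

   ErrorLevel( nNext )   // value the batch file will test after we exit
   QUIT
```

The batch file then tests the errorlevel and starts the matching executable, which in turn does the same thing when it wants to hand control to another program.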

To complicate the situation a bit, the first exe called by the batch file asks the user, as its first step, to choose a working directory: all the following executables will have their curdir() in that directory. An environment variable tells the software which is the root (where the exe and the common files are). So we have the current working directory set to a subdirectory, a root directory, and the need to set the working directory in a way that is kept between executable calls. The software uses the Funcky chdir() function; in Harbour I could not find an equivalent function that keeps the changed directory after exiting the program.

Browsing the code, I think that the programmer had COBOL roots. I have never used COBOL myself but I recently had a look at some COBOL source code, and I found it redundant. The programmer was smart, but not very used to Clipper. There is not a single #include in the 201 source code files! One file, included and run at the beginning of each executable, has a long list of PUBLIC variables named after common #define constants, with values assigned to them… there are even copyright lines from the Clipper include files! As an example:

PUBLIC K_INS
K_INS        =   22     && Ins, Ctrl-V

Now K_INS is a public variable and its value can be modified by mistake. And 22 could change in a different Clipper release… The same happens for the DBedit, aChoice and other constants.
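For comparison, this is the usual way to get the same constant, shown as a minimal sketch with a made-up test program:

```harbour
#include "inkey.ch"   // defines K_INS and the other key codes

PROCEDURE Main()
   LOCAL nKey := Inkey( 0 )   // wait for a key

   IF nKey == K_INS
      ? "Ins pressed"
   ENDIF

   // K_INS is replaced by its value at compile time, so there is no
   // variable that some other module could overwrite at runtime.
   RETURN
```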

I decided to try to compile the main program. The batch file told me which one is called first and a quick look confirmed that it had the menu inside. The menu is implemented… well… let’s just say “implemented”. I have never seen code so strange: as I said, the original programmer was very smart and in fact implemented something that really worked, but it was not really Clipper.
Having located the proper Blinker .lnk file, I manually converted it to hbmk2 .hbp format, listing all the files. The compiler failed in a couple of places because of some incompatibilities with Clipper (more on this later) and I promptly modified the code until it compiled with no errors.

Obviously linking failed with several missing functions, some of them from the Funcky library but some also from the library for which I don’t have sources.

Time to make up a strategy: what to do with the missing functions? What to do with the multiple executables? What about the mix of root/subdirectory? In the next posts, of course :-)

A Christmas gift

I’ve been given a gift this Christmas. A rar compressed file containing 782 files.

It’s the directory used to build an old Clipper software, whose development started more than 20 years ago in Clipper 87 and is still in use by a couple of companies.

One of the two is a company operated by a friend of mine. About 17 years ago I created a program to export data from my friend's “ERP” so it could be imported into this one. Both were Clipper-based and the software had an import function in place. Unfortunately not all the information we needed to move was imported by the procedure, so I did some “tricks” (with the developers' knowledge, guidance and approval): I went directly to the DBFs, searching and adding records. The only drawback was that the user had to force a reindex. Not bad. This system has been working for years…

This old version of the application has been retired; the developers said (a couple of years ago) that they were not going to support it or develop new features for it. The whole codebase was rewritten (ported?) in a completely new language, using a SQL database.

Unfortunately the import procedure present in the new version doesn’t cover all my friend's needs, and I can’t use the trick of adding directly to the data tables. The developers can’t add this new functionality in a short time…

Just to complicate the situation, my friend's company is updating its hardware and switching to Windows 7 64 bit. The old Clipper 87 software is not going to work on those workstations.

So, during a meeting, I just said that perhaps it would be possible to port the software to Harbour and at the same time solve the 64 bit problem and update the printing system that now needs a dot matrix printer. The answer was a “let us think about it” and I interpreted it as a “no”.

Some days later, just before Christmas, I found an email with the rar file in my mailbox.

Back to the gift.

The rar file is just one directory that includes everything to build the executables and support them, including dbase III+, clipper and blinker, libraries, object files, batch files and of course source files.

I’d like to document somehow the porting of this app from Clipper 87 to current Harbour; I can’t name the software nor post its source code or screenshots, but I think that publishing some lines may be ok.