Site Redesign

Written by David Craddock on January 14th, 2010

I’ve just updated the design of this blog, re-enabled comments and added a contact tab. I’ve installed a strong anti-spam comment filter, but you should now be able to comment on entries. I’ve also changed the layout of things slightly, and made it easier to read.

 

PHP Sample – HTML Page Fetcher and Parser

Written by David Craddock on January 14th, 2010

Back in 2008, I wrote a PHP class that fetched an arbitrary URL, parsed it, and converted it into a PHP object with different attributes for the different elements of the page. I recently updated it and sent it along to a company that wanted a programming example to show I could code in PHP.

I thought someone may well find a use for it – I’ve used the class in several different web scraping applications, and I found it handy. From the readme:

This is a class I wrote back in 2008 to help me pull down and parse HTML pages. I updated it on
14/01/10 to print the results in a nicer way to the commandline.

- David Craddock (contact@davidcraddock.net)

/// WHAT IT DOES

It uses CURL to pull down a page from a URL, and sorts it into a 'Page' object
which has different attributes for the different HTML properties of the page
structure. By default it will also print the page object's properties neatly
onto the commandline as part of its unit test.

/// FILES

* README.txt - this file
* page.php - The PHP Class
* LIB_http.php - a lightweight external library that I used. It is just a very light wrapper around CURL's HTTP functions.
* expected-result.txt - output of the unit tests on my development machine
* curl-cookie-jar.txt - this file will be created when you run the page.php's unit test

/// SETUP

You will need CURL installed, PHP's DOMXPATH functions available, and the PHP
command line interface. It was tested on PHP5 on OSX.

/// RUNNING

Use the php commandline executable to run the page.php unit tests. IE:
$ php page.php

You should see a bunch of information being printed out. To save it to a file, you can use:
$ php page.php > result.txt

That will output the info to result.txt so you can read it at will.
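
As a rough illustration of the fetch step the readme describes, here is a minimal sketch that calls PHP's curl functions directly. It is not the code from page.php or LIB_http.php: the function name fetch_page and the keys of the array it returns are hypothetical, and only the cookie jar filename is taken from the readme. It assumes PHP 7 or later with the curl extension loaded.

```php
<?php
// Hypothetical helper for illustration only; the archive's LIB_http.php
// wrapper may be structured quite differently.
function fetch_page(string $url, string $cookieJar = 'curl-cookie-jar.txt'): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the body as a string instead of echoing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Follow redirects so the final page is the one that gets parsed.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Read and write cookies in the same file between requests.
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieJar);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    $body = curl_exec($ch);
    // curl_getinfo() returns the transfer details (url, http_code,
    // total_time and so on) as an associative array.
    $status = curl_getinfo($ch);
    $error = ($body === false) ? curl_error($ch) : null;

    return [
        'body'   => ($body === false) ? null : $body,
        'status' => $status,
        'error'  => $error,
    ];
}
```

Calling fetch_page('http://www.davidcraddock.net') and passing the 'status' entry to print_r() should give an array of the same general shape as the "CURL Fetch Status" listing in the sample output.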

Here’s an example of one of the unit tests, which fetches this frontpage and parses it:

**++++
*** Page Print of http://www.davidcraddock.net ***
**++++

** Transfer Status
+ URL Retrieved:

http://www.davidcraddock.net

+ CURL Fetch Status:
Array
(
    [url] => http://www.davidcraddock.net
    [content_type] => text/html; charset=UTF-8
    [http_code] => 200
    [header_size] => 237
    [request_size] => 175
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 1.490972
    [namelookup_time] => 5.3E-5
    [connect_time] => 0.175803
    [pretransfer_time] => 0.175812
    [size_upload] => 0
    [size_download] => 30416
    [speed_download] => 20400
    [speed_upload] => 0
    [download_content_length] => 30416
    [upload_content_length] => 0
    [starttransfer_time] => 0.714943
    [redirect_time] => 0
)

** Header
+ Title: Random Eye Movement
+ Meta Desc:
Not Set
+ Meta Keywords:
Not Set
+ Meta Robots:
Not Set
** Flags
+ Has Frames?:
FALSE
+ Has body content been parsed?:
TRUE

** Non Html Tags
+ Tags scanned for:
Tag Type: script tags processed: 4
Tag Type: embed tags processed: 1
Tag Type: style tags processed: 0

+ Tag contents:
Array
(
    [ script ] => Array
        (
            [0] => Array
                (
                    [src] => http://www.davidcraddock.net/wp-content/themes/this-just-in/js/ThemeJS.js
                    [type] =>
                    [isinline] =>
                    [content] =>
                )

            [1] => Array
                (
                    [src] => http://www.davidcraddock.net/wp-content/plugins/lifestream/lifestream.js
                    [type] => text/javascript
                    [isinline] =>
                    [content] =>
                )

            [2] => Array
                (
                    [src] =>
                    [type] =>
                    [isinline] => 1
                    [content] =>
                 var odesk_widgets_width = 340;
                var odesk_widgets_height = 230;

                )

            [3] => Array
                (
                    [src] => http://www.odesk.com/widgets/v1/providers/large/~~8f250a5e32c8d3fa.js
                    [type] =>
                    [isinline] =>
                    [content] =>
                )

            [count] => 4
        )

    [ embed ] => Array
        (
            [0] => Array
                (
                    [src] => http://www.youtube-nocookie.com/v/Fpm0m6bVfrM&hl=en&fs=1&rel=0
                    [type] => application/x-shockwave-flash
                    [isinline] =>
                    [content] =>
                )

            [count] => 1
        )

    [ style ] => Array
        (
            [count] => 0
        )

)

**----
*** Page Print of http://www.davidcraddock.net Finished ***
**----
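
As a rough illustration of how a listing like the "Non Html Tags" section above could be built, here is a minimal sketch using PHP's DOMDocument and DOMXPath classes. It is not the Page class from the archive: the function name parse_page and the array keys are made up here to mirror the printed output, and it assumes PHP 7 or later with the DOM extension and a non-empty HTML string.

```php
<?php
// Hypothetical helper for illustration only; not the Page class itself.
function parse_page(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings
    // internally instead of printing them, then discard them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    $titleNode = $xpath->query('//title')->item(0);
    $page = [
        'title'            => $titleNode ? trim($titleNode->textContent) : 'Not Set',
        'meta_description' => 'Not Set',
        'has_frames'       => $xpath->query('//frameset | //frame')->length > 0,
        'tags'             => [],
    ];

    // Meta names are matched case-insensitively.
    foreach ($xpath->query('//meta[@name and @content]') as $meta) {
        if (strtolower($meta->getAttribute('name')) === 'description') {
            $page['meta_description'] = $meta->getAttribute('content');
        }
    }

    // Collect the non-HTML tags, treating a tag without a src
    // attribute as inline and keeping its text content.
    foreach (['script', 'embed', 'style'] as $tagName) {
        $entries = [];
        foreach ($xpath->query('//' . $tagName) as $node) {
            $src = $node->getAttribute('src');
            $entries[] = [
                'src'      => $src,
                'type'     => $node->getAttribute('type'),
                'isinline' => $src === '',
                'content'  => $src === '' ? trim($node->textContent) : '',
            ];
        }
        // Counted before the 'count' key is added, so it is the tag total.
        $entries['count'] = count($entries);
        $page['tags'][$tagName] = $entries;
    }

    return $page;
}
```

Passing the body of a fetched page to parse_page() and printing $page['tags'] with print_r() should produce nested arrays similar in layout to the "Tag contents" listing, although the exact fields of the real class may differ.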

If you want to download a copy, the file is below. If you find it useful, a pingback would be appreciated.

code-sample.tar.gz

 

Passed the W3Schools PHP Certification

Written by David Craddock on January 14th, 2010

As a break from my contract work, I took the W3Schools PHP Certification. I didn’t do any revision, which probably wasn’t wise. It was a bit more difficult than I thought, but I still passed.

 

Config files for the Windows version of VIM

Written by David Craddock on January 10th, 2010

Today I encountered problems configuring the Windows version of the popular text editor VIM, so I thought I’d write up a quick post about configuration files under the Windows version, in case anyone becomes stuck like I did. I use Linux, OSX and Windows on a day-to-day basis, and VIM as a text editor for a lot of quick edits on all three platforms. Here’s a quick comparison:

Linux

Linux is easy because that’s what most people who use VIM run, and so it is very well tested.

~/.vimrc – Configuration file for command line vim.
~/.gvimrc – Configuration file for gui vim.

OSX

OSX is simple also, as it’s based on unix:

~/.vimrc – Configuration file for command line vim.
~/.gvimrc – Configuration file for gui vim.

Windows

Windows is not easy at all. It doesn’t have a unix file structure, and it doesn’t follow the unix convention of hiding files whose names start with a ‘.’, i.e. ‘.vimrc’, ‘.bashrc’, and so on. Most open-source programs like VIM that require these hidden configuration files, and have been ported over to windows, seem to adopt this naming convention: ‘_vimrc’, ‘_bashrc’, and so forth. So:

_vimrc – Configuration file for command line vim.
_gvimrc – Configuration file for gui vim.

Renaming configuration files from “.” to “_” wouldn’t make much difference on its own. You’d have to rename your files, but.. big deal. It’s not much of a problem.

Another, more tricky, problem you may encounter, however, is that there’s no clear home directory on windows systems. Each major incarnation of windows seems to have a slightly different way of dealing with users’ files: there was a change from 2000 to XP, and another from XP to Vista. I haven’t tried VIM on W7 yet, but it seems similar to Vista in structure, so this information may well apply to W7 too.

The Vista 64 version of VIM I have looks in another place for configuration files. For a global configuration file, it looks in “C:\Program Files”. Yes.. “C:\Program Files”. According to Vista 64’s version of VIM, that’s the exact directory where I installed VIM. This is clearly not right. What’s happening is that the file system on windows is different to the unix-type file systems, and the VIM port is having problems adapting. The real VIM install directory is C:\Program Files\vim72. Because VIM is looking for a global configuration file in “C:\Program Files\_vimrc”, it’ll never find it.

Now you could override this with a batch file that sets the right environment variables on startup, or you could change the environment variables exported in windows, but I prefer to have a user-specified configuration file in my personal files directory, as it’s easier to back up and manage. If you wanted to specify the environment variables yourself, which I’m guessing many will, the two to override are:

$VIM = the VIM install directory, not always set properly, as I mentioned.
$HOME = the logged in user’s documents and settings directory, in windows speak this is also where the ‘user profile’ is stored, which is a collection of settings and configurations for the user. The exact directory will depend on which version of Windows you’re running, and if you override the HOME folder, you may have problems with other programs that rely on it being static.

On my Windows Vista 64 install:

$VIM = “C:\Program Files”
$HOME = “C:\Users\Dave”

You can see what files VIM includes by running the handy command

vim -V

at a command prompt; it will go through the different settings and output something similar to this:

Searching for "C:\Users\Dave/vimfiles\filetype.vim"
Searching for "C:\Program Files/vimfiles\filetype.vim"
Searching for "C:\Program Files\vim72\filetype.vim"
line 49: sourcing "C:\Program Files\vim72\filetype.vim"
finished sourcing C:\Program Files\vim72\filetype.vim
continuing in C:\Users\Dave\_vimrc
Searching for "C:\Program Files/vimfiles/after\filetype.vim"
Searching for "C:\Users\Dave/vimfiles/after\filetype.vim"
Searching for "ftplugin.vim" in "C:\Users\Dave/vimfiles,C:\Program Files/vimfiles,C:\Program Files\vim72,C:\Program Files/vimfiles/after,C:\Users\Dave/vimfiles/after"
Searching for "C:\Users\Dave/vimfiles\ftplugin.vim"
Searching for "C:\Program Files/vimfiles\ftplugin.vim"
Searching for "C:\Program Files\vim72\ftplugin.vim"
line 49: sourcing "C:\Program Files\vim72\ftplugin.vim"
finished sourcing C:\Program Files\vim72\ftplugin.vim
continuing in C:\Users\Dave\_vimrc
Searching for "C:\Program Files/vimfiles/after\ftplugin.vim"
Searching for "C:\Users\Dave/vimfiles/after\ftplugin.vim"
finished sourcing $HOME\_vimrc
Searching for "plugin/**/*.vim" in "C:\Users\Dave/vimfiles,C:\Program Files/vimfiles,C:\Program Files\vim72,C:\Program Files/vimfiles/after,C:\Users\Dave/vimfiles/after"
Searching for "C:\Users\Dave/vimfiles\plugin/**/*.vim"
Searching for "C:\Program Files/vimfiles\plugin/**/*.vim"
Searching for "C:\Program Files\vim72\plugin/**/*.vim"
sourcing "C:\Program Files\vim72\plugin\getscriptPlugin.vim"
finished sourcing C:\Program Files\vim72\plugin\getscriptPlugin.vim
sourcing "C:\Program Files\vim72\plugin\gzip.vim"
finished sourcing C:\Program Files\vim72\plugin\gzip.vim
sourcing "C:\Program Files\vim72\plugin\matchparen.vim"
finished sourcing C:\Program Files\vim72\plugin\matchparen.vim
sourcing "C:\Program Files\vim72\plugin\netrwPlugin.vim"
finished sourcing C:\Program Files\vim72\plugin\netrwPlugin.vim
sourcing "C:\Program Files\vim72\plugin\rrhelper.vim"
finished sourcing C:\Program Files\vim72\plugin\rrhelper.vim
sourcing "C:\Program Files\vim72\plugin\spellfile.vim"
finished sourcing C:\Program Files\vim72\plugin\spellfile.vim
sourcing "C:\Program Files\vim72\plugin\tarPlugin.vim"
finished sourcing C:\Program Files\vim72\plugin\tarPlugin.vim
sourcing "C:\Program Files\vim72\plugin\tohtml.vim"
finished sourcing C:\Program Files\vim72\plugin\tohtml.vim
sourcing "C:\Program Files\vim72\plugin\vimballPlugin.vim"
finished sourcing C:\Program Files\vim72\plugin\vimballPlugin.vim
sourcing "C:\Program Files\vim72\plugin\zipPlugin.vim"
finished sourcing C:\Program Files\vim72\plugin\zipPlugin.vim
Searching for "C:\Program Files/vimfiles/after\plugin/**/*.vim"
Searching for "C:\Users\Dave/vimfiles/after\plugin/**/*.vim"
Reading viminfo file "C:\Users\Dave\_viminfo" info
Press ENTER or type command to continue

Notice how it does pull in all the syntax highlighting macros and other extension files correctly, which are specified in the .vim files above, but it doesn’t pull in the global configuration files that I’ve also copied to C:\Program Files\vim72\_gvimrc and C:\Program Files\vim72\_vimrc. However, it does pick up the files I copied to C:\Users\Dave. Both C:\Users\Dave\_vimrc and C:\Users\Dave\_gvimrc are picked up, although VIM will normally only read ‘_gvimrc’ when the gui version of VIM (called gvim) is run.

To see exactly what those environment variables are being set to when you’re inside the editor, issue these two commands, and their values will be shown in the editor:

:echo $HOME
:echo $VIM

It seems to make sense for me – and perhaps you, if you’re working with VIM on windows – to place my _vimrc and _gvimrc configuration files in $HOME in Vista. They are then picked up without having to worry about explicitly defining any environment variables, creating a batch file, or any other hassle.

You can do this easily with the following two commands:

:ed $HOME\_vimrc
:sp $HOME\_gvimrc

That will open the two new configuration files in a split window, and you can paste in the existing configuration that you’ve used in Linux. VIM will pick them up the next time you start it.

 

oDesk and Work

Written by David Craddock on January 3rd, 2010

I’ve been so busy working lately, I’ve hardly had time to update this website. On top of other things, I’ve just started freelancing as a contractor on oDesk – which actually seems quite a good way of getting paid for working on projects at home. Here is my current oDesk profile:



It is a huge community, and I highly recommend the site to those who wish to work from home. You will become part of the global IT workforce, so you may have to lower your rates, but if you’re good at what you do, then you can easily earn a modest living from anywhere. Someone I know is working via oDesk while traveling around Asia, for example.