Medical Nerds Blog Logo
medicalnerds.com

technology, stats and IT for medics

 

R and Filemaker on Mac OS X

March 1st, 2007 by Mark · 4 Comments

R (The R project for statistical computing) and Filemaker provide a compelling solution for the common problems involved in medical research, namely data entry, reporting and analysis. However, there are pecularities in this combination, particularly when running on Mac OS X and trying to use ODBC, that can cause difficulties. This article discusses some solutions to these problems.

R is a powerful open-source statistical package that I would recommend to those seeking more control over their analyses compared to more traditional commercial statistics packages such as SPSS, which is in common use in the Department of Medicine at the University here. The powerful RODBC package provides (on all platforms) trivial access to data held in databases with ODBC drivers.

Filemaker is a powerful but unconventional relational database system. It has a number of advantages over Microsoft Acess: it runs on Mac OS X and Windows, one can create standalone “applications” that may be distributed royalty-free, it is intuitive and makes rapid database development a breeze. It also scales well, has small file-sizes (unless Microsoft Access) and can run as a server.

There are a number of approaches to transferring your data from a database into a statistical package. One can export the data required, and then import into R, SPSS or SAS. This is easy for an occasional analysis, but when one wants to make arbitrarily complex queries dynamically against a changing dataset, then I would (at present) recommend ODBC (Open Database Connectivity), especially if using Windows.

ODBC on the Mac has been a problem, but since Mac OS X 10.4 (Tiger), things have improved a great deal. The standard ODBC configuration in Mac OS X Panther was badly broken and would not adequately set-up ODBC drivers. There was no built-in ODBC functionality before Panther. One used to have to install a third-party ODBC driver, such as one from Openlink, but this is now unnecessary.

ODBC drivers on the Mac are an additional problem. Mac OS X Tiger supports ODBC, but do you have a driver for your database?

I store research data in a combination of Filemaker Pro 8.5 and PostgreSQL on Mac OS X. Filemaker includes ODBC drivers but at present, ODBC access to Filemaker Pro is haphazard and slow; the ODBC driver supports only a small fraction of full ODBC functionality, and running any kind of complex SQL query against a Filemaker database is hopelessly slow. I do not recommend it for production use.

The fix? Well, the next (Filemaker 9) is allegedly going to provide more sophisticated and powerful ODBC. In that case, it may be possible to use Filemaker as a ODBC data source. I do not know when this version will be released.

Until then, I would recommend exporting Filemaker data into a different database, and using that as an ODBC source. There are a number of alternative approaches to this, including writing Filemaker scripts to export data into some common intermediary data format, such as CSV. I think this is rather complicated and prone to error, so instead I recommend using Filemaker as an ODBC client, and issuing SQL commands from within Filemaker against a different database.

I have written two generic Filemaker scripts that export data from table currently associated with the open window, and hope you find them useful. You need to alter the ODBC script steps to configure ODBC for your specific database.

I don’t run these scripts manually, although they can work like that.
I have a looped script that cycles through a list of layouts specific to a database, and then runs these scripts one after the each for each layout. This is database specific, and I haven’t duplicated the code here as it probably won’t be useful for a general audience. If you have difficulties, then do get in touch.

Script: Create ODBC database table

This script creates a database table in the specified ODBC database, deleting if it already exists.
The table will be created, but of course, will contain no records.
It gets the name of the table associated with the current layout.
It then looks at all the fields and one by one builds up a SQL statement.
You will need to edit the “Execute SQL” steps.
There is a script step that evaluates field type and chooses the ODBC field type. In normal circumstances you should use the type as close to the Filemaker type as possible. However, you will notice that I export numbers into “text” fields because of a peculiarity with accessing postgresql on Mac OS X using R and RODBC via Actual’s PostgreSQL ODBC driver. You could set this to “numeric” if necessary.


Set Variable [ $table; Value:Get(LayoutTableName) ]
Set Variable [ $fieldNames; Value:// returns list of fields
// if this table has a field "gExportFields", then use that
// if not, get names of all fields for this table
Let(
[
$fieldNames = GetField("gExportFields");
$len = Length($fieldNames)
];
If ($len>1; $fieldNames; FieldNames(Get(FileName); $table) )) ]
# and now generate the SQL
Set Variable [ $fieldNameList; Value:Substitute ( $fieldNames ; "¶" ; ", " ) ]
Set Variable [ $num_fields; Value:PatternCount ( $fieldNames ; "¶" )+1 ]
Set Variable [ $sql; Value:"create table " & $table & "(" ]
Set Variable [ $f; Value:1 ]
Loop
Set Variable [ $field; Value:GetValue ( $fieldNames ; $f ) ]
# ignore field names containing ::, as these are in other tables
If [ Position ( $field ; "::" ; 1 ; 1 )=0 ]
Set Variable [ $fieldtype; Value:// this converts filemaker datatype to postgresql datatype
// note: for certain datatypes, it may be prudent to use ALTER TABLE liberally
// but this will do the brunt of the work for you!
// NOTE: RODBC has trouble with NAs for numeric/int types, so instead we store numbers as text
// and let R work it all out!
Let(
[
$type = MiddleWords(FieldType( Get(FileName); $field);2;1)
];
Case ( $type="text" ; "text";
//$type="number"; "numeric" ;
$type = "number"; "text"; // store numbers as text for RODBC NA problem
$type="date";"date";
$type="time";"time";
$type="timestamp";"timestamp";
$type="container";"bytea";
"text") // default to text
) ]
Set Variable [ $sql; Value:$sql & If ( $f>1 ; ", " ; "" ) & $field & " " & $fieldtype ]
End If
Set Variable [ $f; Value:$f+1 ]
Exit Loop If [ $f > $num_fields ]
End Loop
Set Variable [ $sql; Value:$sql & ")" ]
Set Error Capture [ On ]
Execute SQL [ ODBC Data Source: “database_name”; Calculated SQL Text: "drop table " & $table ]
[ No dialog ]
Set Error Capture [ Off ]
Execute SQL [ ODBC Data Source: “database_name”; Calculated SQL Text: $sql ]
[ No dialog ]

Export data

You run this script after running the previous script. It exports all data in the current resultset to the ODBC database. Notice that I have added code here that adds a log entry to record data/time of each table export. I use this data to later query how “fresh” my postgreSQL data is, and this can prompt me to re-synchronise PostgreSQL and Filemaker. You will normally wish to make sure you have previously selected “Show all records” and navigated to the first record in the recordset. This script defaults to exporting from the current record to the end of the recordset as this is sometimes useful to update a database record without having to export all the data in the table.


Set Error Capture [ Off ]
#
#
Set Variable [ $table; Value:Get(LayoutTableName) ]
Set Variable [ $fieldNames; Value:// returns list of fields
// if this table has a field "gExportFields", then use that
// if not, get names of all fields for this table
Let(
[
$fieldNames = GetField("gExportFields");
$len = Length($fieldNames)
];
If ($len>1; $fieldNames; FieldNames(Get(FileName); $table) )) ]
# and now insert data for the records
Set Variable [ $fieldNameList; Value:Substitute ( $fieldNames ; "¶" ; ", " ) ]
Set Variable [ $num_fields; Value:PatternCount ( $fieldNames ; "¶" )+1 ]
Loop
Set Variable [ $f; Value:1 ]
Set Variable [ $values; Value:"" ]
Loop
Set Variable [ $field; Value:GetValue ( $fieldNames ; $f ) ]
# ignore fields in related tables (::)
If [ Position ( $field ; "::" ; 1 ; 1 )=0 ]
Set Variable [ $newFieldList; Value:If ( $f=1 ; "" ; $newFieldList & ", " ) & $field ]
Set Variable [ $values; Value:// generate SQL-safe version of the data in $field
// if empty, then we return NULL
// if not empty, then we return the data with all single and double quotes escaped
// for some reason, Substitute() doesn't work with double-quotes
Let(
[
$v = GetField($field);
$len = Length($v)
]
;
$values & If ( $f=1 ; "" ; ", " ) & If ( $len>0 ; "'"&
Filter ( $v ; " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,/:-")
&"'" ; "NULL" )
) ]
End If
Set Variable [ $f; Value:$f+1 ]
Exit Loop If [ $f > $num_fields ]
End Loop
Set Variable [ $sql; Value:"insert into " & $table & "(" & $newFieldList & ") values (" & $values & ")" ]
Execute SQL [ ODBC Data Source: “database_name”; Calculated SQL Text: $sql ]
[ No dialog ]
If [ Get(LastError)<>0 ]
Show Custom Dialog [ Title: "SQL error"; Message: $sql; Buttons: “OK” ]
Exit Script [ Result: -1 ]
End If
Go to Record/Request/Page
[ Next; Exit after last ]
End Loop
# and now update update_log table
Execute SQL [ ODBC Data Source: “database_name”; Calculated SQL Text: "insert into update_log (table_name) values ('" & Get
(LayoutTableName) & "')" ]
[ No dialog ]
Exit Script [ Result: 0 ]

You will need a valid database table called “update_log” to use this unchanged.

For postgreSQL, execute:


create table update_log (timestamp timestamp default now(), table_name text);

Notice that the default value for timestamp is the current time, as so the Filemaker script can safely just instead the name of the database table, and PostgreSQL will automatically timestamp its addition. My query (written in SQL and executed using R/ROBDC and Sweave to create a PDF-based report direct from the database) returns the “maximum” timestamp for each table in a well-presented table.

Tags: Filemaker · ODBC · R statistical computing

4 responses so far ↓

  • 1 Felipe Barbosa // May 14, 2008 at 3:26 am

    Hi.
    I’m a medical student, starting to use filemaker now. I want to know if there is a way to export filemaker data for SPSS or R in such way that text values (determined by a value list) appear in SPSS as a number with a value label:
    Value list on Filemaker:
    Male
    Female
    SPSS:
    0 Male
    1 Female

  • 2 Mark // Jul 9, 2008 at 9:39 pm

    Hi Felipe. Sorry, I only just saw your posting.

    Try using a relational join. Have a table that contains the contents of your value list, use that to populate the value list. In that table, include a number coding for each value. Then, you can export the related number coding via a table lookup for the value entered.

    Easy!

  • 3 William Cullerne Bown // Jun 25, 2009 at 2:23 pm

    Hi Mark. We’re using Filemaker to generate runtime desktop applications for universities interested in analysing data from things like the recent Research Assessment Exercise. I was interested in following up this post but can’t find a way to contact you! Briefly, how does the Filemaker – R linkage stack up now that we’re another year down the line? Also – could you email me so I can call you! Many thanks, W.

  • 4 Mark // Mar 4, 2010 at 9:09 pm

    Hi William. I did try to email you last year.

    I’ve recently retried this with Filemaker 10 and again had great difficulty directly accessing data from R using ODBC. There are alternatives, such as using intermediate CSV files output algorithmically from Filemaker using scripts similar to that above to dynamically output arbitrary tables or to use Filemaker sharing and parse XML.

    I have to say I’m disappointed that it isn’t easier – if there are Filemaker experts who know differently then please let me know. I now use custom web-based research databases and it is easy to connect to these from R as I have complete control of these data at both the database and application levels. Its unlikely I’ll invest more time in Filemaker given that.

Leave a Comment

(Don't forget to fill in the Captcha)