{"id":801,"date":"2015-02-26T17:12:07","date_gmt":"2015-02-27T00:12:07","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=801"},"modified":"2015-02-26T17:12:07","modified_gmt":"2015-02-27T00:12:07","slug":"sqldf-sql-on-r-data-frames","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2015\/02\/26\/sqldf-sql-on-r-data-frames\/","title":{"rendered":"sqldf SQL on R data frames"},"content":{"rendered":"<p><i>To write it, it took three months; to conceive it \u2013 three minutes; to collect the data in it \u2013 all my life.<\/i> <a href=\"http:\/\/en.wikipedia.org\/wiki\/F._Scott_Fitzgerald\" rel=\"nofollow\">F. Scott Fitzgerald<\/a><\/p>\n<p><strong>Latest News<\/strong><\/p>\n<p>(1) sqldf 0.4-10 is on CRAN now. This is a bug fix release to provide further compatibility with the new version of RSQLite. Note that this version requires R (\u2265 3.1.0), gsubfn (\u2265 0.6), RSQLite (\u2265 1.0.0) and DBI (\u2265 0.2-5). If this is a problem for you and you want to use an older version of RSQLite, sqldf, etc. an easy way to revert is to use the checkpoint package:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">checkpoint<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\ncheckpoint<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"2014-10-08\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>as discussed here: <a href=\"http:\/\/stackoverflow.com\/questions\/26571232\/sqldf-not-working-after-update\" rel=\"nofollow\">http:\/\/stackoverflow.com\/questions\/26571232\/sqldf-not-working-after-update<\/a><\/p>\n<p>(2) The new RSQLite 1.0.0 changes how it deals with dots in names. 
They are no longer translated to underscores.<\/p>\n<p>(3) There is now an <a href=\"http:\/\/groups.google.com\/group\/sqldf\" rel=\"nofollow\">sqldf discussion group<\/a> to discuss sqldf (and my other packages).<\/p>\n<p><strong>Introduction<\/strong><\/p>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf<\/a> is an R package for running <a href=\"http:\/\/en.wikipedia.org\/wiki\/SQL\" rel=\"nofollow\">SQL statements<\/a> on R data frames, optimized for convenience. The user simply specifies an SQL statement in R, using data frame names in place of table names. Behind the scenes, a database with appropriate table layouts\/schema is created, the data frames are loaded into it, the specified SQL statement is performed, the result is read back into R and the database is deleted. All of this happens automatically, making the database&#8217;s existence transparent to the user, who only specifies the SQL statement. Surprisingly this can at times <a href=\"http:\/\/stackoverflow.com\/questions\/1727772\/quickly-reading-very-large-tables-as-dataframes-in-r\/1820610#1820610\" rel=\"nofollow\">be<\/a> <a href=\"http:\/\/groups.google.com\/group\/manipulatr\/browse_thread\/thread\/3affbdc5efca9143\/d19d7b97ac023ee8?pli=1\" rel=\"nofollow\">even<\/a> <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-December\/221456.html\" rel=\"nofollow\">faster<\/a> <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-December\/221513.html\" rel=\"nofollow\">than<\/a> <a href=\"http:\/\/stackoverflow.com\/questions\/14283566\/specific-for-loop-too-slow-in-r\/14287476#14287476\" rel=\"nofollow\">the<\/a> corresponding pure R calculation (although the purpose of the project is convenience and not speed). 
<a href=\"http:\/\/brusers.tumblr.com\/post\/59706993506\/data-manipulation-with-sqldf-paul\" rel=\"nofollow\">This link<\/a> suggests that, for aggregations over highly granular columns, sqldf is faster than another alternative tried. <tt>sqldf<\/tt> is free software published under the GNU General Public License that can be downloaded from <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">CRAN<\/a>.<\/p>\n<p>sqldf supports (1) the <a href=\"http:\/\/www.sqlite.org\/\" rel=\"nofollow\">SQLite<\/a> backend database (by default), (2) the <a href=\"http:\/\/www.h2database.com\/\" rel=\"nofollow\">H2<\/a> Java database, (3) the <a href=\"http:\/\/www.postgresql.org\/\" rel=\"nofollow\">PostgreSQL<\/a> database and (4) sqldf 0.4-0 onwards also supports <a href=\"http:\/\/www.mysql.org\/\" rel=\"nofollow\">MySQL<\/a>. SQLite, H2, MySQL and PostgreSQL are free software. SQLite and H2 are embedded, serverless, zero-administration databases that are included right in the R driver packages, <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RSQLite\/index.html\" rel=\"nofollow\">RSQLite<\/a> and <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2<\/a>, so that there is no separate installation for either one. A number of <a href=\"http:\/\/www.sqlite.org\/famous.html\" rel=\"nofollow\">high profile projects<\/a> use SQLite. (Also see this <a href=\"http:\/\/www.viddler.com\/explore\/rentzsch\/videos\/25\/\" rel=\"nofollow\">lecture<\/a>.) H2 is a Java database which contains a large collection of SQL functions and supports Date and other data types. It is the most popular database package among <a href=\"http:\/\/www.takipiblog.com\/2013\/12\/26\/the-top-100-most-popular-scala-libraries-based-on-10000-github-projects\/\" rel=\"nofollow\">Scala packages<\/a>. 
PostgreSQL is a client\/server database and, unlike SQLite and H2, must be separately installed, but it has a particularly powerful version of SQL, e.g. its <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/tutorial-window.html\" rel=\"nofollow\">window<\/a> <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/functions-window.html\" rel=\"nofollow\">functions<\/a>, so the extra installation work can be worth it. sqldf supports the <tt>RPostgreSQL<\/tt> driver in R. Like PostgreSQL, MySQL is a client\/server database that must be installed independently, so it is not as easy to install as SQLite or H2, but it is very popular and is widely used as the back end for web sites.<\/p>\n<p>The information below mostly concerns the default SQLite database. The use of H2 with sqldf is discussed in <a href=\"http:\/\/code.google.com\/p\/sqldf\/#10.__What_are_some_of_the_differences_between_using_SQLite_and_H\" rel=\"nofollow\">FAQ #10<\/a>, which covers the differences between using sqldf with SQLite and H2 and also shows how to modify the code in the <a href=\"https:\/\/code.google.com\/p\/sqldf\/#Examples\">Examples<\/a> section to use sqldf\/H2 rather than sqldf\/SQLite. There is some information on using PostgreSQL with sqldf in <a href=\"http:\/\/code.google.com\/p\/sqldf\/#12._How_does_one_use_sqldf_with_PostgreSQL?\" rel=\"nofollow\">FAQ #12<\/a> and an example in <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_17._Lag\" rel=\"nofollow\">Example 17. Lag<\/a>. The unit tests provide examples that can work with all five database drivers (covering four databases) supported by sqldf. 
They are run by loading whichever database is to be tested (SQLite is the default) and running: <tt>demo(\"sqldf-unitTests\")<\/tt><\/p>\n<ul>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Overview\">Overview<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Citing_sqldf\">Citing sqldf<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#For_Those_New_to_R\">For Those New to R<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#News\">News<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Troubleshooting\">Troubleshooting<\/a>\n<ul>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Problem_is_that_installer_gives_message_that_sqldf_is_not_availa\">Problem is that installer gives message that sqldf is not available<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Problem_with_no_argument_form_of_sqldf_-_sqldf()\">Problem with no argument form of sqldf &#8211; sqldf()<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Problem_involvling_tcltk\">Problem involvling tcltk<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#FAQ\">FAQ<\/a>\n<ul>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#1._How_does_sqldf_handle_classes_and_factors?\">1. How does sqldf handle classes and factors?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#2._Why_does_sqldf_seem_to_mangle_certain_variable_names?\">2. Why does sqldf seem to mangle certain variable names?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#3._Why_does_sqldf(&quot;select_var(x)_from_DF&quot;)_not_work?\">3. Why does sqldf(&#8220;select var(x) from DF&#8221;) not work?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#4._How_does_sqldf_work_with_&quot;Date&quot;_class_variables?\">4. 
How does sqldf work with &#8220;Date&#8221; class variables?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#5._I_get_a_message_about_the_tcltk_package_being_missing.\">5. I get a message about the tcltk package being missing.<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#6._Why_are_there_problems_when_we_use_table_names_or_column_name\">6. Why are there problems when we use table names or column names that are the same except for case?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#7._Why_are_there_messages_about_MySQL?\">7. Why are there messages about MySQL?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#8._Why_am_I_having_problems_with_update?\">8. Why am I having problems with update?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#9._How_do_I_examine_the_layout_that_SQLite_uses_for_a_table?_whi\">9. How do I examine the layout that SQLite uses for a table? which tables are in the database? which databases are attached?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#10.__What_are_some_of_the_differences_between_using_SQLite_and_H\">10. What are some of the differences between using SQLite and H2 with sqldf?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#11._Why_am_I_having_difficulty_reading_a_data_file_using_SQLite\">11. Why am I having difficulty reading a data file using SQLite and sqldf?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#12._How_does_one_use_sqldf_with_PostgreSQL?\">12. How does one use sqldf with PostgreSQL?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#13._How_does_one_deal_with_quoted_fields_in_read.csv.sql_?\">13. How does one deal with quoted fields in read.csv.sql ?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#14._How_does_one_read_files_where_numeric_NAs_are_represented_as\">14. 
How does one read files where numeric NAs are represented as missing empty fields?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#15._Why_do_certain_calculations_come_out_as_integer_rather_than\">15. Why do certain calculations come out as integer rather than double?<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#16._How_can_one_read_a_file_off_the_net_or_a_csv_file_in_a_zip_f\">16. How can one read a file off the net or a csv file in a zip file?<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Examples\">Examples<\/a>\n<ul>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_1._Ordering_and_Limiting\">Example 1. Ordering and Limiting<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_2._Averaging_and_Grouping\">Example 2. Averaging and Grouping<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_3._Nested_Select\">Example 3. Nested Select<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_4._Join\">Example 4. Join<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_5._Insert_Variables\">Example 5. Insert Variables<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_6._File_Input\">Example 6. File Input<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_7._Nested_Select\">Example 7. Nested Select<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_8._Specifying_File_Format\">Example 8. Specifying File Format<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_9.__Working_with_Databases\">Example 9. Working with Databases<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_10._Persistent_Connections\">Example 10. Persistent Connections<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_11._Between_and_Alternatives\">Example 11. 
Between and Alternatives<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_12._Combine_two_files_in_permanent_database\">Example 12. Combine two files in permanent database<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_13._read.csv.sql_and_read.csv2.sql\">Example 13. read.csv.sql and read.csv2.sql<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_14._Use_of_spatialite_library_functions\">Example 14. Use of spatialite library functions<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_15._Use_of_RSQLite.extfuns_library_functions\">Example 15. Use of RSQLite.extfuns library functions<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_16._Moving_Average\">Example 16. Moving Average<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_17._Lag\">Example 17. Lag<\/a><\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_17._MySQL_Schema_Information\">Example 17. MySQL Schema Information<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/code.google.com\/p\/sqldf\/#Links\">Links<\/a><\/li>\n<\/ul>\n<h1><a name=\"Overview\"><\/a>Overview<\/h1>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf<\/a> is an R package for running <a href=\"http:\/\/en.wikipedia.org\/wiki\/SQL\" rel=\"nofollow\">SQL statements<\/a> on R data frames, optimized for convenience. <tt>sqldf<\/tt> works with the <a href=\"http:\/\/www.sqlite.org\/\" rel=\"nofollow\">SQLite<\/a>, <a href=\"http:\/\/www.h2database.com\/\" rel=\"nofollow\">H2<\/a>, <a href=\"http:\/\/www.postgresql.org\/\" rel=\"nofollow\">PostgreSQL<\/a> or <a href=\"http:\/\/dev.mysql.com\/doc\/\" rel=\"nofollow\">MySQL<\/a> databases. SQLite has the fewest prerequisites to install. H2 is just as easy if you have Java installed and also supports the Date class and a few additional functions. 
PostgreSQL notably supports window functions, providing the SQL analogue of the R ave function. MySQL is a particularly popular database that drives many web sites.<\/p>\n<p>More information can be found from within R by installing and loading the sqldf package and then entering <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/sqldf.pdf\" rel=\"nofollow\">?sqldf<\/a> and <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/man\/sqldf-package.Rd\" rel=\"nofollow\">?read.csv.sql<\/a>. A number of <a href=\"https:\/\/code.google.com\/p\/sqldf\/#Examples\">examples<\/a> are on this page and more examples are accessible from within R in the examples section of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/sqldf.pdf\" rel=\"nofollow\">?sqldf<\/a> help page.<\/p>\n<p>As seen from this example which uses the built-in <tt>BOD<\/tt> data frame:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from BOD where Time &gt; 4\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>with <tt>sqldf<\/tt> the user is freed from having to do the following, all of which are automatically done:<\/p>\n<ul>\n<li>database setup<\/li>\n<li>writing the <tt>create table<\/tt> statement which defines each table<\/li>\n<li>importing and exporting to and from the database<\/li>\n<li>coercing of the returned columns to the appropriate class in common cases<\/li>\n<\/ul>\n<p>It can be used for:<\/p>\n<ul>\n<li>learning SQL if you know R<\/li>\n<li>learning R if you know SQL<\/li>\n<li>an alternate syntax for data frame manipulation, particularly for purposes of speeding these up, since sqldf with SQLite as the underlying database is often faster than performing the same manipulations in straight R<\/li>\n<li>reading portions of large files into R without reading the 
entire file (example 6b and example 13 below show two different ways and examples 6e, 6f below show how to read random portions of a file)<\/li>\n<\/ul>\n<p>In the case of SQLite it consists of a thin layer over the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RSQLite\" rel=\"nofollow\">RSQLite<\/a> <a href=\"http:\/\/cran.r-project.org\/web\/packages\/DBI\" rel=\"nofollow\">DBI<\/a> interface to SQLite itself.<\/p>\n<p>In the case of H2 it works on top of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\" rel=\"nofollow\">RH2<\/a> <a href=\"http:\/\/cran.r-project.org\/web\/packages\/DBI\" rel=\"nofollow\">DBI<\/a> driver which in turn uses RJDBC and JDBC to interface to H2 itself.<\/p>\n<p>In the case of PostgreSQL it works on top of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RPostgreSQL\" rel=\"nofollow\">RPostgreSQL<\/a> <a href=\"http:\/\/cran.r-project.org\/web\/packages\/DBI\" rel=\"nofollow\">DBI<\/a> driver.<\/p>\n<p>There is also some untested code in sqldf for use with the <a href=\"http:\/\/www.mysql.com\/\" rel=\"nofollow\">MySQL<\/a> database using the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RMySQL\" rel=\"nofollow\">RMySQL<\/a> <a href=\"http:\/\/cran.r-project.org\/web\/packages\/DBI\" rel=\"nofollow\">DBI<\/a> driver.<\/p>\n<h1><a name=\"Citing_sqldf\"><\/a>Citing sqldf<\/h1>\n<p>To get information on how to cite <tt>sqldf<\/tt> in papers, issue the R commands:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\ncitation<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"sqldf\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h1><a name=\"For_Those_New_to_R\"><\/a>For Those New to R<\/h1>\n<p>If you have not used R before and want to try sqldf with SQLite, <a href=\"http:\/\/www.r-project.org\/\" rel=\"nofollow\">google for single letter R<\/a>, download R, install 
it on Windows, Mac or UNIX\/Linux and then start R and at R console enter this:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># installs everything you need to use sqldf with SQLite<\/span>\n<span class=\"com\"># including SQLite itself<\/span><span class=\"pln\">\ninstall<\/span><span class=\"pun\">.<\/span><span class=\"pln\">packages<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"sqldf\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># shows built in data frames<\/span><span class=\"pln\">\ndata<\/span><span class=\"pun\">()<\/span> \n<span class=\"com\"># load sqldf into workspace<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris limit 5\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select count(*) from iris\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select Species, count(*) from iris group by Species\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># create a data frame<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> letters<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">])<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from 
DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select avg(a) mean, variance(a) var from DF\"<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># see example 15<\/span><\/pre>\n<p>To try it with H2 rather than SQLite the process is similar. Ensure that you have the <a href=\"http:\/\/java.sun.com\/\" rel=\"nofollow\">java<\/a> runtime installed, install R as above and start R. From within R enter this ensuring that the version of RH2 that you have is RH2 0.1-2.6 or later:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># installs everything including H2<\/span><span class=\"pln\">\ninstall<\/span><span class=\"pun\">.<\/span><span class=\"pln\">packages<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"sqldf\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dep <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># load RH2 driver and sqldf into workspace<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">RH2<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\npackageVersion<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"RH2\"<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># should be version 0.1-2-6 or later<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\">#<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris limit 5\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select count(*) from iris\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select Species, count(*) from iris group by Species\"<\/span><span 
class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> letters<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">])<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select avg(a) mean, var_samp(a) var from DF\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h1><a name=\"News\"><\/a>News<\/h1>\n<p>October 27, 2014. sqldf 0.4-9 is now on CRAN and is propagating to the mirrors. It addresses RSQLite 1.0.0, which introduced incompatibilities with prior versions of RSQLite. Also note that RSQLite 1.0.0 no longer translates dots in column names to underscores.<\/p>\n<p>January 20, 2014. sqldf 0.4-7 released to address changes in R for R 3.0.<\/p>\n<p>March 28, 2012. <a href=\"http:\/\/cran.r-project.org\/package=sqldf\" rel=\"nofollow\">sqldf 0.4-6.4<\/a> has been uploaded to <a href=\"http:\/\/cran.r-project.org\/package=sqldf\" rel=\"nofollow\">CRAN<\/a>. See <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">NEWS file<\/a>.<\/p>\n<p>December 19, 2011. <a href=\"http:\/\/cran.r-project.org\/package=sqldf\" rel=\"nofollow\">sqldf 0.4-6.1<\/a> has been uploaded to <a href=\"http:\/\/cran.r-project.org\/package=sqldf\" rel=\"nofollow\">CRAN<\/a>. It fixes a minor bug.<\/p>\n<p>December 10, 2011. 
<a href=\"http:\/\/cran.r-project.org\/package=sqldf\" rel=\"nofollow\">sqldf 0.4-6<\/a> has been uploaded to CRAN. See <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">NEWS file<\/a>.<\/p>\n<p>December 1, 2011. Some changes to <a href=\"http:\/\/code.google.com\/p\/sqldf\/#4._How_does_sqldf_work_with_&quot;Date&quot;_class_variables?\" rel=\"nofollow\">FAQ #4<\/a> have been made to incorporate the improvements in RSQLite 0.11.0 .<\/p>\n<p>November 28, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2 0.1-2.8<\/a> has been uploaded to CRAN. It includes a new version, 1.3.162, of H2.<\/p>\n<p>November 22, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RPostgreSQL\/\" rel=\"nofollow\">RPostgreSQL<\/a> support has been added to sqldf in the <a href=\"http:\/\/code.google.com\/p\/sqldf\/source\/checkout\" rel=\"nofollow\">sqldf development version<\/a>.<\/p>\n<p>November 21, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf 0.4-5<\/a> is now on CRAN and should propagate to the mirrors shortly. See <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">NEWS<\/a>.<\/p>\n<p>November 15, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf 0.4-4<\/a> has been uploaded to CRAN. The primary <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">new feature<\/a> is the inclusion of a gawk program, <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/inst\/csv.awk\" rel=\"nofollow\">csv.awk<\/a>, which can transform input files by removing quotes surrounding fields, unescaping embedded quotes and replacing field separators with different separators. 
See the example <a href=\"https:\/\/code.google.com\/p\/sqldf\/#13._How_does_one_deal_with_quoted_fields_in_read.csv.sql_?\">here<\/a> and also see <tt>?sqldf<\/tt> from within R. Added later: Note that a bug was found in this awk program &#8212; try the <tt>csvfix<\/tt> program instead.<\/p>\n<p>November 5, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf 0.4-3<\/a> has been uploaded to CRAN. This version allows the <tt>file<\/tt> argument to be omitted in <tt>read.csv.sql<\/tt> if <tt>filter<\/tt> is specified and no file input is needed. (Previously it had to be specified as &#8220;NUL&#8221; or &#8220;\/dev\/null&#8221; depending on OS.) Also, if the <tt>file<\/tt> argument begins with &#8220;http:&#8221; or &#8220;ftp:&#8221; in those commands then it first downloads the file before reading it into SQLite. See <a href=\"http:\/\/code.google.com\/p\/sqldf\/#16._How_can_one_read_a_file_off_the_net_or_a_csv_file_in_a_zip_f\" rel=\"nofollow\">FAQ #16<\/a>.<\/p>\n<p>October 20, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2 0.1-2.7<\/a> has been uploaded to CRAN. This version is a bug fix release.<\/p>\n<p>August 8, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf 0.4-2<\/a> has been uploaded to CRAN. This version adds the <tt>nrows<\/tt> and <tt>field.types<\/tt> arguments to <tt>read.csv.sql<\/tt> and <tt>read.csv2.sql<\/tt>.<\/p>\n<p>July 30, 2011. <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2 0.1-2.6<\/a> has been uploaded to CRAN. This version corrects a documentation bug.<\/p>\n<p>July 23, 2011. RH2 0.1-2.5 is <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">on CRAN<\/a>. It should appear on the mirrors shortly. 
A significant change in RH2 is that it includes H2 1.3.158 which no longer requires that built-in function names be upper case.<\/p>\n<p>July 23, 2011. sqldf 0.4-1.2 is <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">on CRAN<\/a>. It should appear on the mirrors shortly. This version is a bug fix version.<\/p>\n<p>June 28, 2011. sqldf 0.4-1 is <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">on CRAN<\/a>. See <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">NEWS<\/a> for changes.<\/p>\n<p>June 15, 2011. sqldf 0.4-0 is <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">on CRAN<\/a>. See <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/NEWS\" rel=\"nofollow\">NEWS<\/a> for a list of changes.<\/p>\n<p>May 24, 2011. The <a href=\"http:\/\/code.google.com\/p\/sqldf\/source\/checkout\" rel=\"nofollow\">development version of sqldf<\/a> now has MySQL support. It now also has a unit test suite that can be used with svUnit. The test suite works with any of the RSQLite, RH2, RMySQL and RpgSQL driver packages.<\/p>\n<p>May 11, 2011. A new version of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RpgSQL\/index.html\" rel=\"nofollow\">RpgSQL<\/a> PostgreSQL driver supported by sqldf is now on CRAN. See the RpgSQL <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RpgSQL\/NEWS\" rel=\"nofollow\">NEWS<\/a> file.<\/p>\n<p>March 7, 2011. A new version of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2<\/a> driver, version 0.1-2.3, has been uploaded to CRAN. It includes a workaround for the problem that the RJDBC driver which RH2 uses reads NULLs in numeric database fields into R as 0. With this change they are read into R as NA.<\/p>\n<p>December 16, 2010. A new example has been added below. 
See <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_17._Lag\" rel=\"nofollow\">Example 17. Lag<\/a>.<\/p>\n<p>October 2, 2010. A new version of the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RpgSQL\/index.html\" rel=\"nofollow\">RpgSQL<\/a> PostgreSQL driver supported by sqldf is now on CRAN. See the RpgSQL <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RpgSQL\/NEWS\" rel=\"nofollow\">NEWS<\/a> file.<\/p>\n<p>August 30, 2010. The development source allows the <tt>method<\/tt> argument of <tt>sqldf<\/tt> to be a function or the character string <tt>\"name__class\"<\/tt> (as well as the previously allowed values of NULL, &#8220;raw&#8221; and &#8220;auto&#8221;). If <tt>\"name__class\"<\/tt> is specified then instead of the usual class assignment heuristic <tt>sqldf<\/tt> uses the column names to determine class. Any column name of the form <tt>\"x__y\"<\/tt> where <tt>y<\/tt> is some R class, e.g. <tt>\"mydate__Date\"<\/tt>, is converted to that class and the suffix is removed. If a function is given as the value of the <tt>method<\/tt> argument then <tt>sqldf<\/tt> calls it, passing the data frame prior to class conversion as its first argument. This provides a way for user transformations to hook into <tt>sqldf<\/tt>. 
For example:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a__Date <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b__POSIXct <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> c <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"name__class\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0a \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 b c\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span> <span class=\"lit\">0<\/span>\n<span class=\"lit\">2<\/span> <span 
class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">1<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">## same<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">.<\/span><span class=\"pln\">method <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"name_class\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0a \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 b c\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span> <span class=\"lit\">0<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">00<\/span><span class=\"pun\">:<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">1<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> processDates <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">data<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">...)<\/span> <span class=\"pun\">{<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> ix <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> grepl<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"_date$\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">data<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">data<\/span><span class=\"pun\">)[<\/span><span class=\"pln\">ix<\/span><span class=\"pun\">]<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">sub<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"_date$\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">data<\/span><span class=\"pun\">)[<\/span><span class=\"pln\">ix<\/span><span class=\"pun\">])<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">[<\/span><span class=\"pln\">ix<\/span><span class=\"pun\">]<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> lapply<\/span><span class=\"pun\">(<\/span><span class=\"pln\">data<\/span><span class=\"pun\">[<\/span><span class=\"pln\">ix<\/span><span class=\"pun\">],<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span 
class=\"typ\">Date<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> origin <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"1970-01-01\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> data\n<\/span><span class=\"pun\">+<\/span> <span class=\"pun\">}<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a_date <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> c <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF2\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> processDates<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0a c\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">0<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">1970<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">1<\/span><\/pre>\n<p>August 21, 2010. A new example has been added below. See <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_16._Moving_Average\" rel=\"nofollow\">Example 16. Moving Average<\/a> .<\/p>\n<p>June 5, 2010. A new example has been added below. 
See <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_15._Use_of_RSQLite.extfuns_library_functions\" rel=\"nofollow\">Example 15. Use of RSQLite.extfuns package library functions<\/a> .<\/p>\n<p>June 5, 2010. Version 0.3-5 of sqldf has been uploaded to CRAN. See <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/inst\/NEWS\" rel=\"nofollow\">NEWS file<\/a>.<\/p>\n<p>April 16, 2010. Added <a href=\"http:\/\/pages.citebite.com\/c2p3y1i2y2btv\" rel=\"nofollow\">example 4j Per Group Min and Max<\/a> on this page.<\/p>\n<p>March 16, 2010. gsubfn which sqldf depends on has come out with a new version, gsubfn 0.5-1, that can run without tcltk. That means sqldf can also run without tcltk now if tcltk is not found. tcltk is still suggested and parsing of the SQL command will be faster if tcltk is available.<\/p>\n<p>March 15, 2010. sqldf discussed in this January 2010 <a href=\"http:\/\/analisisydecision.es\/monografico-paquete-sqldf-si-sabes-sql-sabes-r\/\" rel=\"nofollow\">Spanish language blog post<\/a> (<a href=\"http:\/\/translate.google.com\/translate?hl=en&amp;sl=es&amp;u=http:\/\/analisisydecision.es\/monografico-paquete-sqldf-si-sabes-sql-sabes-r\/&amp;prev=http:\/\/blogsearch.google.com\/blogsearch%3Fhl%3Den%26ie%3DUTF-8%26q%3Dsqldf%26lr%3D%26sa%3DN\" rel=\"nofollow\">English translation<\/a>) .<\/p>\n<p>March 12, 2010. <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/231711.html\" rel=\"nofollow\">this link<\/a> has an sqldf example using SQLite and <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/231712.html\" rel=\"nofollow\">this link<\/a> solves the same problem also using sqldf but this time with PostgreSQL making use of PostgreSQL&#8217;s windowing functions.<\/p>\n<p>February 13, 2010. 
New versions: <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf version 0.3-4<\/a> and <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2 version 0.1-2<\/a> (DBI\/RJDBC driver for <a href=\"http:\/\/www.h2database.com\/\" rel=\"nofollow\">H2 database<\/a>) have been uploaded to CRAN. Also a new package <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RpgSQL\/index.html\" rel=\"nofollow\">RpgSQL version 0.1-1<\/a> (DBI\/RJDBC driver for <a href=\"http:\/\/www.postgresql.org\/\" rel=\"nofollow\">PostgreSQL database<\/a>) has been uploaded to CRAN. The default action of sqldf (if <tt>sqldf<\/tt>&#8217;s <tt>drv=<\/tt> argument is not used and if the <tt>\"sqldf.driver\"<\/tt> global option is not used) is to use PostgreSQL if RpgSQL is loaded or H2 if RH2 is loaded or SQLite otherwise. The main change in sqldf is that all <a href=\"http:\/\/www.h2database.com\/html\/grammar.html\" rel=\"nofollow\">H2 statements<\/a> are now supported, not just those statements that return results. The packages should become accessible from the <a href=\"http:\/\/cran.r-project.org\/\" rel=\"nofollow\">CRAN main site<\/a> and the <a href=\"http:\/\/cran.r-project.org\/mirrors.html\" rel=\"nofollow\">mirrors<\/a> shortly.<\/p>\n<p>February 7, 2010. New versions, <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf version 0.3-3<\/a> and <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/index.html\" rel=\"nofollow\">RH2 version 0.1-1<\/a> (R driver for H2 database), have been uploaded to CRAN. They are primarily bug fix versions. Notable bugs that were eliminated were associated with the use of the persistence feature (using sqldf without any arguments) and the use of the filter= argument.<\/p>\n<p>February 6, 2010. 
Added <a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_13._read.csv.sql_and_read.csv2.sql\">Example 13c<\/a> illustrating use of the <tt>filter=<\/tt> argument with <tt>read.csv.sql<\/tt>.<\/p>\n<p>February 1, 2010. sqldf 0.3-2 is now in the svn repository and has been uploaded to CRAN. It now also supports the <a href=\"http:\/\/www.h2database.com\/\" rel=\"nofollow\">H2<\/a> embedded Java database. This database has some <a href=\"http:\/\/www.h2database.com\/html\/functions.html\" rel=\"nofollow\">SQL functions<\/a> not available in SQLite. For more info see <a href=\"http:\/\/code.google.com\/p\/sqldf\/#10.__What_are_some_of_the_differences_between_using_SQLite_and_H\" rel=\"nofollow\">FAQ #10<\/a>.<\/p>\n<p>January 27, 2010. Added <a href=\"http:\/\/code.google.com\/p\/sqldf\/#9._How_do_I_examine_the_layout_that_SQLite_uses_for_a_table?\" rel=\"nofollow\">FAQ #9<\/a> on examining table layouts.<\/p>\n<p>January 26, 2010. Added <a href=\"http:\/\/code.google.com\/p\/sqldf\/#8._Why_am_I_having_problems_with_update?\" rel=\"nofollow\">FAQ #8<\/a> on update.<\/p>\n<p>January 24, 2010. Added <a href=\"http:\/\/code.google.com\/p\/sqldf\/#7._Why_are_there_messages_about_MySQL?\" rel=\"nofollow\">FAQ #7<\/a> on MySQL.<\/p>\n<p>January 22, 2010. Added <a href=\"http:\/\/code.google.com\/p\/sqldf\/#6._Why_are_there_problems_when_we_use_table_names_or_column_name\" rel=\"nofollow\">FAQ #6<\/a> on case sensitivity.<\/p>\n<p>January 15, 2010. sqldf was listed in Drew Conway&#8217;s top 10 <a href=\"http:\/\/www.drewconway.com\/zia\/?p=1614\" rel=\"nofollow\">Must-Have R Packages for Social Scientists<\/a> in a December 2009 post on his Zero Intelligence Agents blog. sqldf was also mentioned in November in <a href=\"http:\/\/dataspora.com\/blog\/sql-is-dead-long-live-sql\/\" rel=\"nofollow\">dataspora<\/a> by Michael E. 
Driscoll and is the subject of a blog post in <a href=\"http:\/\/www.cerebralmastication.com\/2009\/11\/loading-big-data-into-r\/\" rel=\"nofollow\">Cerebral Mastication<\/a> by J. D. Long. sqldf is also recommended for a particular application in <a href=\"http:\/\/stackoverflow.com\/questions\/1169551\/sql-like-functionality-in-r\" rel=\"nofollow\">stackoverflow<\/a> and Juliet Jacobson discusses why it fits in with her work flow <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-January\/224579.html\" rel=\"nofollow\">here<\/a>. Also some recent tweets on sqldf can be found <a href=\"http:\/\/twitter.com\/ozjimbob\/status\/6479231902\" rel=\"nofollow\">here<\/a> and <a href=\"http:\/\/twitter.com\/zenogantner\/status\/2453139516\" rel=\"nofollow\">here<\/a>.<\/p>\n<p>December 28, 2009. New bug fix release <tt>sqldf 0.2-1<\/tt> on <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">CRAN<\/a>. See <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/inst\/NEWS\" rel=\"nofollow\">NEWS file<\/a>.<\/p>\n<p>December 26, 2009. Folded the Bugs section into <a href=\"http:\/\/code.google.com\/p\/sqldf\/#4._How_does_sqldf_work_with_&quot;Date&quot;_class_variables?\" rel=\"nofollow\">FAQ #4<\/a> since this is more of an explanation of how to use dates in SQLite than a bug. That section has been further expanded to show how to use SQLite <a href=\"http:\/\/www.sqlite.org\/lang_datefunc.html\" rel=\"nofollow\">date and time functions<\/a> to solve some problems involving the R <tt>Date<\/tt> class.<\/p>\n<p>December 22, 2009. <tt>sqldf 0.2-0<\/tt> has been released and is available on <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">CRAN<\/a>. It now works with the latest version of <tt>DBI<\/tt>, <tt>DBI 0.2-5<\/tt> (which quotes column names that are SQL reserved words instead of appending <tt>__1<\/tt> to their name so the mangling of column names that are SQL reserved words is gone). 
Also <tt>sqldf 0.2-0<\/tt> supports the <tt>libspatial-1.dll<\/tt> SQLite loadable extension which gives the user access to several dozen new SQL functions listed here: <a href=\"http:\/\/www.gaia-gis.it\/spatialite\/spatialite-sql-2.3.1.html\" rel=\"nofollow\">http:\/\/www.gaia-gis.it\/spatialite\/spatialite-sql-2.3.1.html<\/a>. The user must download this dll and place it in their path if they want to use these functions. (If this is not done <tt>sqldf<\/tt> will still work but without those new functions.) Also new are the <tt>filter=<\/tt> argument of <tt>read.csv.sql<\/tt> and the <tt>read.csv2.sql<\/tt> command. For more details see this <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-packages\/2009\/001083.html\" rel=\"nofollow\">announcement<\/a> and the <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/inst\/NEWS\" rel=\"nofollow\">NEWS file<\/a>.<\/p>\n<p>December 9, 2009. Titus von der Malsburg <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-December\/221456.html\" rel=\"nofollow\">posted on r-help<\/a> performance results of a problem with about 8,000 rows comparing an <tt>sqldf<\/tt> solution to 4 other solutions using <tt>aggregate<\/tt>, <tt>summaryBy<\/tt>, <tt>by<\/tt> and <tt>tapply<\/tt>, respectively, and found that the <tt>sqldf<\/tt> solution was the fastest. Marek Jared <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-December\/221513.html\" rel=\"nofollow\">posted<\/a> a variation on the problem, which included making it self-contained, and reached the same conclusion. (Added later: there are also some performance results <a href=\"http:\/\/www.cerebralmastication.com\/2009\/11\/loading-big-data-into-r\/\" rel=\"nofollow\">here<\/a>.) 
Since <tt>sqldf<\/tt> must build a database, transfer data frames to it, perform the operations, transfer the result back and destroy the database it created, we would not expect it to be the fastest possible solution. Nevertheless, as these performance tests show, it is remarkably good and in those cases was actually faster than anything else tried. (<i>Note:<\/i> if your queries are running slowly you can speed them up, sometimes dramatically, by using indexing and ensuring that the queries are specified in such a way that the created indexes are actually used. See example 4i on this page.)<\/p>\n<p>September 25, 2009. A new version of sqldf is on CRAN. It contains bug fixes and can also handle table names with a dot in the name provided the table name is enclosed in back quotes in the SQL statement.<\/p>\n<p>August 30, 2009. Added Example 4f (temporal join) to this page.<\/p>\n<p>June 16, 2009. Added <tt>read.csv2.sql<\/tt> to the development version. It is like <tt>read.csv.sql<\/tt> except that <tt>sep<\/tt> defaults to &#8220;;&#8221;. See <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_13._read.csv.sql\" rel=\"nofollow\">Example 13b<\/a> at the end of this page.<\/p>\n<p>June 7, 2009. Version 0.1-5 of <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">sqldf<\/a> is now on CRAN and should propagate to the mirrors shortly. <tt>read.csv.sql<\/tt> is new. See <a href=\"https:\/\/code.google.com\/p\/sqldf\/#Example_13._read.csv.sqlExample\">Example 13<\/a> below.<\/p>\n<p>June 4, 2009. New command <tt>read.csv.sql<\/tt>.<\/p>\n<p>May 16, 2009. Example 6g added below.<\/p>\n<p>April 22, 2009. Added example 4e (left join) in the Examples section below. <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_4._Join\" rel=\"nofollow\">Example 4 section<\/a><\/p>\n<p>March 29, 2009. Added example 7c in the Examples section below. 
<a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_7._Nested_Select\" rel=\"nofollow\">Example 7 section<\/a><\/p>\n<p>March 25, 2009. Added to <a href=\"http:\/\/code.google.com\/p\/sqldf\/#3._Why_does_sqldf(%22select_var(x)_from_DF%22)_not_work?\" rel=\"nofollow\">FAQ<\/a> 3, showing how to use <tt>group_concat<\/tt> to apply <tt>R<\/tt> functions.<\/p>\n<p>March 17, 2009. Added Example 4d, temporal join, in the Examples section below.<\/p>\n<p>February 20, 2009. Added Example 12. Combine two files in permanent database.<\/p>\n<p>February 5, 2009. Added to <a href=\"http:\/\/code.google.com\/p\/sqldf\/#2._Why_does_sqldf_seem_to_mangle_certain_variable_names?\" rel=\"nofollow\">FAQ 2<\/a> and created new <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_11._Between_and_Alternatives\" rel=\"nofollow\">Example 11<\/a> thanks to Michael Rehberg.<\/p>\n<p>January 16, 2009. Added new FAQ section below and incorporated old Heuristic section into it as question 1.<\/p>\n<p>December 10, 2008. sqldf 0.1-4 uploaded to <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">CRAN<\/a><\/p>\n<p>November 19, 2008. Minor improvements to this web page.<\/p>\n<p>September 30, 2008. Added example 6f which shows how to work with files that have fixed column widths (as opposed to the fields being delimited).<\/p>\n<p>June 17, 2008. Added persistent connections to sqldf. It allows one to write this: <tt>sqldf(); sqldf(s1); sqldf(s2); sqldf()<\/tt> where <tt>s1<\/tt> and <tt>s2<\/tt> are character strings containing SQL statements. The first and last <tt>sqldf<\/tt> statements with no args open and close a connection and the middle two use it implicitly. There are also facilities to explicitly reference the connection so that <tt>sqldf<\/tt> and <tt>RSQLite<\/tt> calls can be intermixed. See Examples 10a and 10b below, which are new.<\/p>\n<p>June 16, 2008. Added Example 9 below.<\/p>\n<p>April 18, 2008. 
Updated section below on the <tt>sqldf<\/tt> heuristic.<\/p>\n<p>April 14, 2008. New section on the Heuristic <tt>sqldf<\/tt> uses further down on this page.<\/p>\n<p>January 29, 2008. New Example 8 below was added.<\/p>\n<p>November 16, 2007. Added Example 7b below. This shows a query that is similar to 7a but in the context of time series.<\/p>\n<p>October 28, 2007. Added Example 7 below showing a complex query.<\/p>\n<p>October 12, 2007. Added Example 6e showing how to read a random set of rows from a file without reading the entire file into R.<\/p>\n<p>August 29, 2007. Expanded Example 6 below.<\/p>\n<p>August 11, 2007. Changes in the <a href=\"http:\/\/code.google.com\/p\/sqldf\/source\" rel=\"nofollow\">development version of sqldf<\/a> are that the sql argument, <tt>x<\/tt> can now be a vector with one component per sql command. Each will be executed in turn and result of last one returned.<\/p>\n<p>August 7, 2007. Changes in the <a href=\"http:\/\/code.google.com\/p\/sqldf\/source\" rel=\"nofollow\">development version of sqldf<\/a> are:<\/p>\n<ul>\n<li>supports reading large input files straight to the database (as opposed to reading them into R and then writing them to the database). See Example 6 below (which is also at end of <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/man\/sqldf.Rd\" rel=\"nofollow\">sqldf.Rd<\/a>).<\/li>\n<li>argument list has been modified somewhat (although the most common usage is still only to specify a single argument, the SQL select statement) and<\/li>\n<li>it has been partially tested with MySQL (previously only SQLite).<\/li>\n<\/ul>\n<p>July 31, 2007. sqldf 0.1-1 (replacing sqldf 0.1-0) is on <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">CRAN<\/a>. 
See <a href=\"http:\/\/sqldf.googlecode.com\/svn\/trunk\/inst\/NEWS\" rel=\"nofollow\">NEWS file<\/a> for changes.<\/p>\n<h1><a name=\"Troubleshooting\"><\/a>Troubleshooting<\/h1>\n<p>sqldf has been <a href=\"http:\/\/cran.r-project.org\/web\/checks\/check_results_sqldf.html\" rel=\"nofollow\">extensively<\/a> <a href=\"http:\/\/code.google.com\/p\/sqldf\/source\/browse\/trunk\/inst\/unitTests\/runit.all.R\" rel=\"nofollow\">tested<\/a> with multiple architectures and database back ends but there are no guarantees.<\/p>\n<h2><a name=\"Problem_is_that_installer_gives_message_that_sqldf_is_not_availa\"><\/a>Problem is that installer gives message that sqldf is not available<\/h2>\n<p>See <a href=\"http:\/\/stackoverflow.com\/questions\/27772756\/sqldf-doesnt-install-on-ubuntu-14-04\" rel=\"nofollow\">http:\/\/stackoverflow.com\/questions\/27772756\/sqldf-doesnt-install-on-ubuntu-14-04<\/a><\/p>\n<h2><a name=\"Problem_with_no_argument_form_of_sqldf_-_sqldf()\"><\/a>Problem with no argument form of sqldf &#8211; sqldf()<\/h2>\n<p>The no argument form, i.e. <tt>sqldf()<\/tt> is used for opening and closing a connection so that intermediate sqldf statements can all use the same connection. 
If you have forgotten whether the last <tt>sqldf()<\/tt> opened or closed the connection, this code will close it if it is open and otherwise do nothing:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">\u00a0 \u00a0<\/span><span class=\"com\"># close an old connection if it exists<\/span><span class=\"pln\">\n\u00a0 \u00a0<\/span><span class=\"kwd\">if<\/span> <span class=\"pun\">(!<\/span><span class=\"kwd\">is<\/span><span class=\"pun\">.<\/span><span class=\"kwd\">null<\/span><span class=\"pun\">(<\/span><span class=\"pln\">getOption<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"sqldf.connection\"<\/span><span class=\"pun\">)))<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span><\/pre>\n<p>Thanks to Chris Davis <a href=\"https:\/\/groups.google.com\/d\/msg\/sqldf\/-YAvaJnlRrY\/7nF8tpBnrcAJ\" rel=\"nofollow\">https:\/\/groups.google.com\/d\/msg\/sqldf\/-YAvaJnlRrY\/7nF8tpBnrcAJ<\/a> for pointing this out.<\/p>\n<h2><a name=\"Problem_involvling_tcltk\"><\/a>Problem involving tcltk<\/h2>\n<p>The most common problem is that the tcltk package and tcl\/tk itself are missing. Historically these were bundled with the Windows version of R so Windows users should not experience any problems on this account. Since R version 3.0.0 Mac versions of R also have the tcltk package and Tcl\/Tk itself bundled, so if you are having a problem on the Mac you may only need to upgrade to the latest version of R. If upgrading to the latest version of R does not help then using this line will usually allow it to work even without the tcltk package and tcl\/tk itself:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">gsubfn<\/span><span class=\"pun\">.<\/span><span class=\"pln\">engine <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"R\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>Running the above <tt>options<\/tt> line before using <tt>sqldf<\/tt>, e.g. 
by putting that options line in your <tt>.Rprofile<\/tt>, is all that is needed to get sqldf to work without the tcltk package and tcl\/tk itself in most cases; however, this does have the downside that it will use the R engine which is slower. An alternative is to rebuild R yourself as discussed here: <a href=\"http:\/\/permalink.gmane.org\/gmane.comp.lang.r.fedora\/235\" rel=\"nofollow\">http:\/\/permalink.gmane.org\/gmane.comp.lang.r.fedora\/235<\/a><\/p>\n<p>If the above does not resolve the problem then read the more detailed discussion below.<\/p>\n<p>A related problem is that your R installation is flawed or incomplete in some way and the main way to fix that is to fix your installation of R. This will not only affect sqldf but also many other R packages so information on installing them can also help here. In particular <a href=\"http:\/\/socserv.socsci.mcmaster.ca\/jfox\/Misc\/Rcmdr\/installation-notes.html\" rel=\"nofollow\">installation information for the Rcmdr package<\/a> may be useful since it&#8217;s likely that if you can install Rcmdr then you can also install sqldf.<\/p>\n<ul>\n<li>sqldf uses the gsubfn R package which normally uses the tcltk R package which in turn uses tcl\/tk itself. The tcltk package is a core component of R so a complete distribution of R should have tcltk capability. For this to happen tcl\/tk <strong>must<\/strong> be present at the time <strong>R itself was built<\/strong> (the build process automatically excludes tcltk capability if it does not sense that tcl\/tk is present at the time R itself is built) but it is possible to run gsubfn and therefore also sqldf without tcl\/tk present at the time sqldf runs (although it will run slower if you do this). There are three possibilities: (1) <strong>tcltk capability absent<\/strong>. If this command from within R <tt>capabilities()[[\"tcltk\"]]<\/tt> is <tt>FALSE<\/tt> then your distribution of R was built without tcltk capability. 
In that case you <strong>must<\/strong> use a different distribution of R. All common distributions of R including the CRAN distribution for Windows and most distributions for Linux do have tcltk capability. Note that a given version of R may have been built with or without tcltk capability so simply checking which version of R you have won&#8217;t tell you whether your distribution was built correctly. This situation mostly affects distributions of R built by the user or improperly built by others and then distributed. (2) <strong>tcl\/tk missing on system<\/strong> (a) If your distribution of R was built with tcltk capability as described in the last point but you don&#8217;t have tcl\/tk itself on your system you can simply install tcl\/tk yourself. In most cases this is actually quite easy to do; it&#8217;s typically a one-line apt-get on Linux. There is information about installing tcl\/tk near the end of <a href=\"https:\/\/code.google.com\/p\/sqldf\/#5._I_get_a_message_about_the_tcltk_package_being_missing.\">FAQ #5<\/a>. (b) If your distribution of R was built with tcltk capability as described in the first point but you don&#8217;t have tcl\/tk on your system and you don&#8217;t want to bother to install it then issue the R command:<\/li>\n<\/ul>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">gsubfn<\/span><span class=\"pun\">.<\/span><span class=\"pln\">engine <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"R\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>In that case gsubfn will use the slower R engine instead of the faster tcltk engine so you won&#8217;t need tcl\/tk installed on your system in the first place. Be sure you are using gsubfn 0.6-4 or later if you use this option since prior versions of gsubfn had a bug which could interfere with the use of this option. 
To check your version of gsubfn:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">packageVersion<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"gsubfn\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<ul>\n<li>using an old version of R, sqldf or some other software. If that is the problem upgrade to the most recent versions <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/index.html\" rel=\"nofollow\">on CRAN<\/a>. Also be sure you are using the latest versions of other packages used by sqldf. If you are getting NAMESPACE errors then this is likely the problem. You can find the current version of R <a href=\"http:\/\/cran.r-project.org\/mirrors.html\" rel=\"nofollow\">here<\/a> and then install sqldf from within R using <tt>install.packages(\"sqldf\")<\/tt> . If you already have the current version of R and have installed the packages you want then you can update your installed packages to the current version by entering this in R: <tt>update.packages()<\/tt> . In most cases all the mirrors are up to date but if that should fail to update to the most recent packages on CRAN then try using a more up to date mirror.<\/li>\n<\/ul>\n<ul>\n<li>unexpected errors concerning H2, MySQL or PostgreSQL. sqldf automatically uses H2, MySQL or PostgreSQL if the R package RH2, RMySQL or RpgSQL is loaded, respectively. If none of them are loaded it uses sqlite. To force it to use sqlite even though one of those others is loaded (1) add the <tt>drv = \"SQLite\"<\/tt> argument to each sqldf call or (2) issue the R command:<\/li>\n<\/ul>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">.<\/span><span class=\"pln\">driver <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"SQLite\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>in which case all sqldf calls will use sqlite. 
See <a href=\"https:\/\/code.google.com\/p\/sqldf\/#7._Why_are_there_messages_about_MySQL?\">FAQ #7<\/a> for more info.<\/p>\n<ul>\n<li>message about tcltk being missing or other tcltk problem. This is really the same problem discussed in the first point above. Upgrade to sqldf 0.4-5 or later. If it still persists then set this option: <tt>options(gsubfn.engine = \"R\")<\/tt> which causes R code to be substituted for the tcl code or else just install the tcltk package. See <a href=\"https:\/\/code.google.com\/p\/sqldf\/#5._I_get_a_message_about_the_tcltk_package_being_missing.\">FAQ #5<\/a> for more info. If you installed the tcltk package and it still has problems then remove the tcltk package and try these steps again.<\/li>\n<\/ul>\n<ul>\n<li>error messages regarding a data frame that has a dot in its name. The dot is an SQL operator. Either quote the name in the SQL statement (for example, enclose it in back quotes) or change the name of the data frame to one without a dot.<\/li>\n<\/ul>\n<ul>\n<li>as recommended in the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/sqldf\/INSTALL\" rel=\"nofollow\">INSTALL<\/a> file it&#8217;s better to install sqldf using <tt>install.packages(\"sqldf\")<\/tt> and <strong>not<\/strong> <tt>install.packages(\"sqldf\", dep = TRUE)<\/tt> since the latter will try to pull in every R database driver package supported by sqldf which increases the likelihood of a problem with installation. It&#8217;s unlikely that you need every database that sqldf supports so doing this is really asking for trouble. The recommended way does install sqlite automatically anyway and if you want any of the additional ones just install them separately.<\/li>\n<\/ul>\n<ul>\n<li>Mac users. 
According to <a href=\"http:\/\/cran.us.r-project.org\/bin\/macosx\/tools\/\" rel=\"nofollow\">http:\/\/cran.us.r-project.org\/bin\/macosx\/tools\/<\/a> Tcl\/Tk comes with R 3.0.0 and later, but if you are using an earlier version of R look at <a href=\"http:\/\/r.789695.n4.nabble.com\/sqldf-hanging-on-macintosh-works-on-windows-tt3022193.html#a3022397\" rel=\"nofollow\">this link<\/a>.<\/li>\n<\/ul>\n<h1><a name=\"FAQ\"><\/a>FAQ<\/h1>\n<h2><a name=\"1._How_does_sqldf_handle_classes_and_factors?\"><\/a>1. How does sqldf handle classes and factors?<\/h2>\n<p><tt>sqldf<\/tt> uses a heuristic to assign classes and factor levels to returned results. It checks each column name returned against the column names in the input data frames and if the output column name matches any input column name then it assigns the input class to the output. If two input data frames have the same column names then this automatic assignment is disabled if they differ in class. Also if <tt>method = \"raw\"<\/tt> then the automatic class assignment is disabled. This extends to factor levels as well, so that if an output column corresponds to an input column that is of class &#8220;factor&#8221; then the factor levels of the input column are assigned to the output column (again assuming that only one input column has the output column name). Also in the case of factors the levels of the output must appear among the levels of the input.<\/p>\n<p>sqldf knows about Date, POSIXct and chron (dates, times) classes but not POSIXlt and other date and time classes.<\/p>\n<p>Previously this section had an example of how the heuristic could go awry but improvements in the heuristic in sqldf 0.4-0 are such that that example now works as expected.<\/p>\n<h2><a name=\"2._Why_does_sqldf_seem_to_mangle_certain_variable_names?\"><\/a>2. 
Why does sqldf seem to mangle certain variable names?<\/h2>\n<p>Starting with RSQLite 1.0.0 and sqldf 0.4-9 dots in column names are no longer translated to underscores.<\/p>\n<p>If you are using an older version of these packages then note that since dot is an SQL operator the RSQLite driver package converts dots to underscores so that SQL statements can reference such columns unquoted.<\/p>\n<p>Also note that certain names are SQL keywords. These can be found using this code:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">.<\/span><span class=\"pln\">SQL92Keywords<\/span><\/pre>\n<p>Note that using such names can sometimes result in an error message such as:<\/p>\n<pre class=\"prettyprint\"><span class=\"typ\">Error<\/span> <span class=\"kwd\">in<\/span><span class=\"pln\"> sqliteExecStatement<\/span><span class=\"pun\">(<\/span><span class=\"pln\">con<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> statement<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> bind<\/span><span class=\"pun\">.<\/span><span class=\"pln\">data<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">:<\/span><span class=\"pln\">\n\u00a0RS<\/span><span class=\"pun\">-<\/span><span class=\"pln\">DBI driver<\/span><span class=\"pun\">:<\/span> <span class=\"pun\">(<\/span><span class=\"pln\">error <\/span><span class=\"kwd\">in<\/span><span class=\"pln\"> statement<\/span><span class=\"pun\">:<\/span> <span class=\"kwd\">no<\/span><span class=\"pln\"> such column<\/span><span class=\"pun\">:<\/span> <span class=\"pun\">...)<\/span><\/pre>\n<p>which appears to suggest that the column does not exist, but the real cause is that it has a different name than expected. 
For an example of what happens:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># this only applies to old versions of sqldf and DBI<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># based on example by Adrian Dragulescu<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">index<\/span><span class=\"pun\">=<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">12<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> date<\/span><span class=\"pun\">=<\/span><span class=\"pln\">rep<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sys<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">()-<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span> <span class=\"typ\">Sys<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">()),<\/span> <span class=\"lit\">6<\/span><span class=\"pun\">),<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"kwd\">group<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"A\"<\/span><span class=\"pun\">,<\/span><span class=\"str\">\"B\"<\/span><span class=\"pun\">,<\/span><span class=\"str\">\"C\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> value<\/span><span class=\"pun\">=<\/span><span class=\"pln\">round<\/span><span class=\"pun\">(<\/span><span class=\"pln\">rnorm<\/span><span class=\"pun\">(<\/span><span class=\"lit\">12<\/span><span class=\"pun\">),<\/span><span class=\"lit\">2<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> 
library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 index date <\/span><span class=\"kwd\">group<\/span><span class=\"pln\"> value\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0A \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.24<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0B \u00a0 \u00a0 <\/span><span class=\"lit\">0.16<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0C \u00a0 \u00a0 <\/span><span class=\"lit\">1.24<\/span>\n<span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">4<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0A \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">1.16<\/span>\n<span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0B \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.19<\/span>\n<span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">6<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0C \u00a0 \u00a0 <\/span><span 
class=\"lit\">0.65<\/span>\n<span class=\"lit\">7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">7<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0A \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">1.24<\/span>\n<span class=\"lit\">8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">8<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0B \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.34<\/span>\n<span class=\"lit\">9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">9<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0C \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.27<\/span>\n<span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">10<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0A \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.18<\/span>\n<span class=\"lit\">11<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">11<\/span> <span class=\"lit\">14259.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0B \u00a0 \u00a0 <\/span><span class=\"lit\">0.57<\/span>\n<span class=\"lit\">12<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">12<\/span> <span class=\"lit\">14260.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0C \u00a0 \u00a0<\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.83<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> intersect<\/span><span class=\"pun\">(<\/span><span class=\"pln\">names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> tolower<\/span><span class=\"pun\">(.<\/span><span 
class=\"pln\">SQL92Keywords<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"index\"<\/span> <span class=\"str\">\"date\"<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"str\">\"group\"<\/span> <span class=\"str\">\"value\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> DF\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># change column names to i, d, g and v<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF2<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> substr<\/span><span class=\"pun\">(<\/span><span class=\"pln\">names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">),<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF2\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 i \u00a0 \u00a0 \u00a0 \u00a0 \u00a0d g \u00a0 \u00a0 v\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> A \u00a0<\/span><span class=\"lit\">0.35<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">2<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> B <\/span><span class=\"pun\">-<\/span><span 
class=\"lit\">0.96<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">3<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> C \u00a0<\/span><span class=\"lit\">0.76<\/span>\n<span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">4<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> A \u00a0<\/span><span class=\"lit\">0.07<\/span>\n<span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">5<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> B \u00a0<\/span><span class=\"lit\">0.03<\/span>\n<span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">6<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> C \u00a0<\/span><span class=\"lit\">0.19<\/span>\n<span class=\"lit\">7<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">7<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> A <\/span><span class=\"pun\">-<\/span><span class=\"lit\">2.03<\/span>\n<span class=\"lit\">8<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">8<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> B \u00a0<\/span><span class=\"lit\">0.98<\/span>\n<span class=\"lit\">9<\/span><span class=\"pln\"> 
\u00a0 <\/span><span class=\"lit\">9<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> C <\/span><span class=\"pun\">-<\/span><span class=\"lit\">1.21<\/span>\n<span class=\"lit\">10<\/span> <span class=\"lit\">10<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> A <\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.67<\/span>\n<span class=\"lit\">11<\/span> <span class=\"lit\">11<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span><span class=\"pln\"> B \u00a0<\/span><span class=\"lit\">2.49<\/span>\n<span class=\"lit\">12<\/span> <span class=\"lit\">12<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">17<\/span><span class=\"pln\"> C <\/span><span class=\"pun\">-<\/span><span class=\"lit\">0.63<\/span><\/pre>\n<h2><a name=\"3._Why_does_sqldf(&quot;select_var(x)_from_DF&quot;)_not_work?\"><\/a>3. Why does sqldf(&#8220;select var(x) from DF&#8221;) not work?<\/h2>\n<p>The SQL statement passed to sqldf must be a valid SQL statement understood by the database. The functions that are understood include simple SQLite functions and aggregate SQLite functions and functions in the <a href=\"http:\/\/code.google.com\/p\/sqldf\/#Example_15._Use_of_RSQLite.extfuns_library_functions\" rel=\"nofollow\">RSQLite.extfuns<\/a> package. Thus in this case in place of var(x) one could use variance(x) from the RSQLite.extfuns package. 
For SQLite functions see the lists of <a href=\"http:\/\/www.sqlite.org\/lang_corefunc.html\" rel=\"nofollow\">core functions<\/a>, <a href=\"http:\/\/www.sqlite.org\/lang_aggfunc.html\" rel=\"nofollow\">aggregate functions<\/a> and <a href=\"http:\/\/www.sqlite.org\/lang_datefunc.html\" rel=\"nofollow\">date and time functions<\/a>.<\/p>\n<p>If each group is not too large we can use group_concat to return all group members and then later use <tt>apply<\/tt> in <tt>R<\/tt> to use R functions to aggregate results. For example, in the following we summarize the data using <tt>sqldf<\/tt> and then <tt>apply<\/tt> a function based on <tt>var<\/tt>:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">8<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> g <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> gl<\/span><span class=\"pun\">(<\/span><span class=\"lit\">2<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">4<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select group_concat(a) groupa from DF group by g\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">\n\u00a0 \u00a0groupa\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"lit\">2<\/span><span class=\"pun\">,<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"lit\">4<\/span>\n<span class=\"lit\">2<\/span> <span 
class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"lit\">6<\/span><span class=\"pun\">,<\/span><span class=\"lit\">7<\/span><span class=\"pun\">,<\/span><span class=\"lit\">8<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">$var <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> apply<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">out<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span> <span class=\"kwd\">var<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"pln\">strsplit<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">)[[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]])))<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">\n\u00a0 \u00a0groupa \u00a0 \u00a0 \u00a0<\/span><span class=\"kwd\">var<\/span>\n<span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"lit\">2<\/span><span class=\"pun\">,<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"lit\">4<\/span> <span class=\"lit\">1.666667<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"lit\">6<\/span><span class=\"pun\">,<\/span><span class=\"lit\">7<\/span><span class=\"pun\">,<\/span><span class=\"lit\">8<\/span> <span class=\"lit\">1.666667<\/span><\/pre>\n<h2><a name=\"4._How_does_sqldf_work_with_&quot;Date&quot;_class_variables?\"><\/a>4. 
How does sqldf work with &#8220;Date&#8221; class variables?<\/h2>\n<p>The H2 database has specific support for Date class variables so with H2 Date class variables work as expected:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">RH2<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># driver support for dates was added in RH2 version 0.1-2<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> test1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sale_date <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"2008-08-01\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2031-01-09\"<\/span><span class=\"pun\">,<\/span>\n<span class=\"pun\">+<\/span> <span class=\"str\">\"1990-01-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2007-02-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1997-01-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2004-02-04\"<\/span><span class=\"pun\">)))<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"pln\">test1<\/span><span class=\"pun\">[[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]])<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"lit\">14092<\/span> <span class=\"lit\">22288<\/span><span 
class=\"pln\"> \u00a0<\/span><span class=\"lit\">7307<\/span> <span class=\"lit\">13547<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">9864<\/span> <span class=\"lit\">12452<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select MAX(sale_date) from test1\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 MAX<\/span><span class=\"pun\">..<\/span><span class=\"pln\">sale_date<\/span><span class=\"pun\">..<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2031<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><\/pre>\n<p>In R, <tt>Date<\/tt> class dates are stored internally as the number of days since 1970-01-01, often referred to as the UNIX Epoch. (They are stored this way on non-UNIX platforms as well.) When the dates are transferred to SQLite they are stored there as these numbers. (sqldf has a heuristic that attempts to ascertain whether the column represents a Date but if it cannot ascertain this then it returns the numeric internal version.)<\/p>\n<p>In SQLite this is what happens:<\/p>\n<p>The examples below use RSQLite 0.11-0 (prior to that version they would return wrong answers). With that version of RSQLite the query returns the correct answer but Date class columns will be returned as numeric if sqldf&#8217;s heuristic cannot automatically determine if they are to be of class <tt>\"Date\"<\/tt>. 
If you give the output column the same name as an input column which has <tt>\"Date\"<\/tt> class then sqldf will correctly infer that the output is to be of class <tt>\"Date\"<\/tt> as well.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> test1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sale_date <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"2008-08-01\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2031-01-09\"<\/span><span class=\"pun\">,<\/span>\n<span class=\"pun\">+<\/span> <span class=\"str\">\"1990-01-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2007-02-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1997-01-03\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2004-02-04\"<\/span><span class=\"pun\">)))<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"pln\">test1<\/span><span class=\"pun\">[[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]])<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"lit\">14092<\/span> <span class=\"lit\">22288<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">7307<\/span> <span class=\"lit\">13547<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">9864<\/span> <span class=\"lit\">12452<\/span>\n\n<span class=\"pun\">&gt;<\/span> 
<span class=\"com\"># correct except that it returns the numeric internal representation<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> dd <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select max(sale_date) from test1\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> dd\n\u00a0 max<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sale_date<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">22288<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># fix it up<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> dd<\/span><span class=\"pun\">[[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]]<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"pln\">dd<\/span><span class=\"pun\">[[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]],<\/span> <span class=\"str\">\"1970-01-01\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> dd\n\u00a0 max<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sale_date<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">2031<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># even better it returns Date class if we name column same as a Date class input column<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select max(sale_date) sale_date from test1\"<\/span><span class=\"pun\">)<\/span><span 
class=\"pln\">\n\u00a0 \u00a0sale_date\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2031<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><\/pre>\n<p>Also note this code:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"typ\">Sys<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">()<\/span> <span class=\"pun\">+<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 a b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pun\">-<\/span><span class=\"lit\">31<\/span> <span class=\"lit\">1<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">3<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">2009<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">4<\/span>\n<span class=\"lit\">5<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">5<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"typ\">Sys<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">()<\/span> <span class=\"pun\">+<\/span> <span class=\"lit\">2<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"2009-08-01\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> s <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sprintf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF where a &gt;= %d\"<\/span><span class=\"pun\">,<\/span> <span class=\"typ\">Sys<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">()<\/span> <span class=\"pun\">+<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> s\n<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"select * from DF where a &gt;= 14457\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">s<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 a b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span 
class=\"lit\">02<\/span> <span class=\"lit\">3<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">4<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">5<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># to compare against character string store a as character<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> transform<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> a <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">character<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF2 where a &gt;= '2009-08-01'\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 a b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">3<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">4<\/span>\n<span 
class=\"lit\">4<\/span> <span class=\"lit\">2009<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">5<\/span><\/pre>\n<p>See <a href=\"http:\/\/www.sqlite.org\/lang_datefunc.html\" rel=\"nofollow\">date and time functions<\/a> for more information. An example using times but not dates can be found <a href=\"http:\/\/stackoverflow.com\/questions\/8185201\/merge-records-over-time-interval\/8187602#8187602\" rel=\"nofollow\">here<\/a> and some discussion on using POSIXct can be found <a href=\"https:\/\/groups.google.com\/d\/msg\/sqldf\/N-Xci-eKy3Y\/faLa1siY6xYJ\" rel=\"nofollow\">here<\/a>.<\/p>\n<h2><a name=\"5._I_get_a_message_about_the_tcltk_package_being_missing.\"><\/a>5. I get a message about the tcltk package being missing.<\/h2>\n<p>The sqldf package uses the gsubfn package for parsing and the gsubfn package optionally uses the tcltk R package which in turn uses the string processing language tcl internally.<\/p>\n<p>If you are getting errors about the tcltk R package being missing or about tcl\/tk itself being missing then:<\/p>\n<p>Windows. This should not occur on Windows with the standard distributions of R. If it does you likely have a version of R that was built improperly and you will need a complete, properly built version of R that works with tcltk and tcl\/tk and includes tcl\/tk itself.<\/p>\n<p>Mac. This should not occur on <strong>recent<\/strong> versions of R on Mac. If it does occur upgrade your R installation to a recent version. If you must use an older version of R on the Mac then get tcl\/tk here: <a href=\"http:\/\/cran.us.r-project.org\/bin\/macosx\/tools\/\" rel=\"nofollow\">http:\/\/cran.us.r-project.org\/bin\/macosx\/tools\/<\/a><\/p>\n<p>UNIX\/Linux. 
If you don&#8217;t already have tcl\/tk itself on your system, try installing it like this (thanks to Eric Iversion):<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">sudo apt<\/span><span class=\"pun\">-<\/span><span class=\"kwd\">get<\/span><span class=\"pln\"> install tcl<\/span><span class=\"pun\">-<\/span><span class=\"pln\">dev tk<\/span><span class=\"pun\">-<\/span><span class=\"pln\">dev<\/span><\/pre>\n<p>Also see this message by Rolf Turner: <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2011-April\/274424.html\" rel=\"nofollow\">https:\/\/stat.ethz.ch\/pipermail\/r-help\/2011-April\/274424.html<\/a>.<\/p>\n<p>In some cases it may be possible to bypass the need for tcltk and tcl\/tk altogether by running this command before you run sqldf:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">gsubfn<\/span><span class=\"pun\">.<\/span><span class=\"pln\">engine <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"R\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>In that case the gsubfn package will use alternate R code instead of tcltk (however, it will be slightly slower).<\/p>\n<p>Notes: sqldf depends on gsubfn for parsing and gsubfn optionally uses the tcltk R package (tcl is a string processing language) which is supposed to be included in every R installation. The tcltk R package relies on tcl\/tk itself, which is included in all standard distributions of R on Windows and on <strong>recent<\/strong> Mac distributions of R. Many Linux distributions also include tcl\/tk itself.<\/p>\n<p>Also note that whatever build of R you are using must have had tcl\/tk present at the time R was built (not just at the time it&#8217;s used) or else the R build process will automatically turn off tcltk capability within R. If that is the case, supplying tcltk and tcl\/tk later won&#8217;t help. You must use a build of R that has tcltk capability built in. 
(If the R was built with tcltk capability then adding the tcltk package (if its missing) and tcl\/tk will work.)<\/p>\n<h2><a name=\"6._Why_are_there_problems_when_we_use_table_names_or_column_name\"><\/a>6. Why are there problems when we use table names or column names that are the same except for case?<\/h2>\n<p>SQL is case insensitive so table names <tt>a<\/tt> and <tt>A<\/tt> are the same as far as SQLite is concerned. Note that in the example below it did produce a warning that something is wrong although that might not be the case in all situations.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> a <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">2<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> A <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">y <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">11<\/span><span class=\"pun\">:<\/span><span class=\"lit\">12<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from a a1, A a2\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 x x\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span>\n<span 
class=\"typ\">Warning<\/span><span class=\"pln\"> message<\/span><span class=\"pun\">:<\/span>\n<span class=\"typ\">In<\/span><span class=\"pln\"> value<\/span><span class=\"pun\">[[<\/span><span class=\"lit\">3L<\/span><span class=\"pun\">]](<\/span><span class=\"pln\">cond<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">:<\/span><span class=\"pln\">\n\u00a0 RS<\/span><span class=\"pun\">-<\/span><span class=\"pln\">DBI driver<\/span><span class=\"pun\">:<\/span> <span class=\"pun\">(<\/span><span class=\"pln\">error <\/span><span class=\"kwd\">in<\/span><span class=\"pln\"> statement<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> table <\/span><span class=\"str\">`A`<\/span><span class=\"pln\"> already exists<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"7._Why_are_there_messages_about_MySQL?\"><\/a>7. Why are there messages about MySQL?<\/h2>\n<p>sqldf can use several different databases. The database is specified in the <tt>drv=<\/tt> argument to the <tt>sqldf<\/tt> function. If <tt>drv=<\/tt> is not specified then it uses the value of the <tt>\"sqldf.driver\"<\/tt> global option to determine which database to use. If that is not specified either then if the RPostgreSQL, RMySQL or RH2 package is loaded (it checks in that order) it uses the associated database and otherwise uses SQLite. Thus if you do not specify the database and you have one of those packages loaded it will think you intended to use that database. If it&#8217;s likely that you will have one of these packages loaded but you do not want to use that package with sqldf, be sure to set the sqldf.driver option, e.g. <tt>options(sqldf.driver = \"SQLite\")<\/tt>.<\/p>\n<h2><a name=\"8._Why_am_I_having_problems_with_update?\"><\/a>8. Why am I having problems with update?<\/h2>\n<p>Although data frames referenced in the SQL statement(s) passed to sqldf are automatically imported to SQLite, sqldf does not automatically export anything for safety reasons. 
Thus if you update a table using sqldf you must explicitly return it as shown in the examples below.<\/p>\n<p>Note that in the select statement we referred to the table as <tt>main.DF<\/tt> (<tt>main<\/tt> is always the name of the sqlite database.) If we had referred to the table as <tt>DF<\/tt> (without qualifying it as being in <tt>main<\/tt>) sqldf would have fetched <tt>DF<\/tt> from our R workspace rather than using the updated one in the sqlite database.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> NA<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">5<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"update DF set b = a where b is null\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"select * from main.DF\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0a b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span> <span class=\"lit\">3<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">3<\/span> <span class=\"lit\">5<\/span><\/pre>\n<p>One other problem can arise if the data has factors. 
Here we would normally get the wrong result because we are asking it to add a value to column <tt>b<\/tt> that is not among the factor levels in <tt>b<\/tt> but by using <tt>method = \"raw\"<\/tt> we can tell it not to automatically assign classes to the result.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> factor<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> NA<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">5<\/span><span class=\"pun\">)));<\/span><span class=\"pln\"> DF\n\u00a0a \u00a0 \u00a0b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">3<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span> <span class=\"pun\">&lt;<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">&gt;<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">5<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"update DF set b = a where b is null\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"select * from main.DF\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"raw\"<\/span><span 
class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0a b\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">1<\/span> <span class=\"lit\">3<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span> <span class=\"lit\">2<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">3<\/span> <span class=\"lit\">5<\/span><\/pre>\n<p>Another way around this is to avoid the entire problem in the first place by not using a factor for <tt>b<\/tt>. If we had defined column <tt>b<\/tt> as character or numeric instead of factor then we would not have had to specify <tt>method = \"raw\"<\/tt>.<\/p>\n<h2><a name=\"9._How_do_I_examine_the_layout_that_SQLite_uses_for_a_table?_whi\"><\/a>9. How do I examine the layout that SQLite uses for a table? which tables are in the database? which databases are attached?<\/h2>\n<p>Try these approaches to get the indicated meta data:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># a. what is the layout of the BOD table?<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"pragma table_info(BOD)\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 cid \u00a0 name type notnull dflt_value pk\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">0<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Time<\/span><span class=\"pln\"> REAL \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span class=\"pun\">&lt;<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">&gt;<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">0<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">1<\/span><span class=\"pln\"> demand REAL \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 <\/span><span 
class=\"pun\">&lt;<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">&gt;<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">0<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># b. which tables are in current database and what is their layout?<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from BOD\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"select * from sqlite_master\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 \u00a0type name tbl_name rootpage\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> table \u00a0BOD \u00a0 \u00a0 \u00a0BOD \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 sql\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> CREATE TABLE <\/span><span class=\"str\">`BOD`<\/span> <span class=\"pun\">\\<\/span><span class=\"pln\">n<\/span><span class=\"pun\">(<\/span> <span class=\"str\">\"Time\"<\/span><span class=\"pln\"> REAL<\/span><span class=\"pun\">,\\<\/span><span class=\"pln\">n<\/span><span class=\"pun\">\\<\/span><span class=\"pln\">tdemand REAL <\/span><span class=\"pun\">\\<\/span><span class=\"pln\">n<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># c. which databases are attached? 
\u00a0(This says only 'main' is attached.)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"pragma database_list\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 seq name file\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">0<\/span><span class=\"pln\"> main \u00a0\n\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># d. which version of sqlite is being used?<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select sqlite_version()\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 sqlite_version<\/span><span class=\"pun\">()<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.7<\/span><span class=\"pun\">.<\/span><span class=\"lit\">17<\/span><\/pre>\n<h2><a name=\"10.__What_are_some_of_the_differences_between_using_SQLite_and_H\"><\/a>10. What are some of the differences between using SQLite and H2 with sqldf?<\/h2>\n<p>sqldf will use the H2 database instead of sqlite if the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RH2\/\" rel=\"nofollow\">RH2<\/a> package is loaded. Features supported by H2 not supported by SQLite include Date class columns and certain <a href=\"http:\/\/www.h2database.com\/html\/functions.html\" rel=\"nofollow\">functions<\/a> such as VAR_SAMP, VAR_POP, STDDEV_SAMP, STDDEV_POP, various XML functions and CSVREAD.<\/p>\n<p><strong>Note that the examples below require RH2 0.1-2.6 or later.<\/strong><\/p>\n<p>Here are some commands. 
The meta commands here are specific to H2 (for SQLite&#8217;s meta data commands see <a href=\"https:\/\/code.google.com\/p\/sqldf\/#9._How_do_I_examine_the_layout_that_SQLite_uses_for_a_table?_whi\">FAQ#9<\/a>):<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">RH2<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># this package contains the H2 database and an R driver<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select avg(demand) mean, stddev_pop(demand) from BOD where Time &gt; 4\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Species, \"Sepal.Length\" from iris limit 3'<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># Sepal.Length has dot<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show databases\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show tables\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show tables from INFORMATION_SCHEMA\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from INFORMATION_SCHEMA.settings\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * FROM INFORMATION_SCHEMA.indexes\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select VALUE from INFORMATION_SCHEMA.SETTINGS where NAME = 'info.VERSION'\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> 
\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show columns from BOD\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select H2VERSION()\"<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># this requires a later version of H2 than comes with RH2<\/span><\/pre>\n<p>If RH2 is loaded then it will use H2 so if you wish to use SQLite anyways then either use the drv= argument to sqldf:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from BOD\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> drv <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"SQLite\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>or set the following global option:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">.<\/span><span class=\"pln\">driver <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"SQLite\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>When using H2:<\/p>\n<ul>\n<li>in H2 a column such as Sepal.Length is not converted to Sepal_Length (which older versions of RSQLite do) but remains as Sepal.Length. For example,<\/li>\n<\/ul>\n<pre class=\"prettyprint\"><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Species, avg(\"Sepal.Length\") \"Sepal Length\" from iris \n\u00a0 \u00a0group by Species order by Species'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> drv <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"H2\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>Also sqlite orders the result above even without the order clause and h2 translates &#8220;Sepal Length&#8221; to Sepal.Length .<\/p>\n<ul>\n<li>quoting rules in H2 are stricter than in SQLite. 
In H2, to quote an identifier use double quotes whereas to quote a constant use single quotes.<\/li>\n<\/ul>\n<ul>\n<li>file objects are not supported. They are not really needed because H2 supports a <a href=\"http:\/\/www.h2database.com\/html\/functions.html#csvread\" rel=\"nofollow\">CSVREAD<\/a> function. Note that on Windows one can use the R notation ~ to refer to the home directory when specifying filenames if using SQLite but not with CSVREAD in H2.<\/li>\n<\/ul>\n<ul>\n<li>currently the only SQL statements supported by sqldf when using H2 are select, show and call (whereas all are supported with SQLite).<\/li>\n<\/ul>\n<ul>\n<li>H2 does not support the using clause in SQL select statements but does support on. Also it implicitly uses <tt>on<\/tt> rather than <tt>using<\/tt> in natural joins which means that selected and where condition variables that are merged in natural joins must be qualified in H2 but need not be in SQLite.<\/li>\n<\/ul>\n<pre class=\"prettyprint\"><span class=\"typ\">Abbr<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Species<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> levels<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris$Species<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 <\/span><span class=\"typ\">Abbr<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"S\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Ve\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Vi\"<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"com\"># This works in both H2 and SQLite:<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select iris.Species, Abbr, COUNT(*) \n\u00a0 from iris natural join Abbr group by iris.Species'<\/span><span 
class=\"pun\">)<\/span>\n\n<span class=\"com\"># but this only works in SQLite. \u00a0Note that Species not qualified.<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Species, Abbr, COUNT(*) \n\u00a0 from iris natural join Abbr group by Species'<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>The examples in the Examples section are redone below using H2. Where H2 does not support the operation the SQLite code is given instead. Note that this section is a bit out of date and some of the items that it says are not supported actually are supported now.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># 1<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris order by \"Sepal.Length\" desc limit 3'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 2<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Species, avg(\"Sepal.Length\") from iris group by Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 3<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select iris.Species \"[Species]\",\n\u00a0 \u00a0 \u00a0 \u00a0avg(\"Sepal.Length\") \"[Avg of SLs &gt; avg SL]\"\n\u00a0 \u00a0 from iris, \n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0(select Species, avg(\"Sepal.Length\") SLavg \n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0from iris group by Species) SLavg\n\u00a0 \u00a0 where iris.Species = SLavg.Species \n\u00a0 \u00a0 \u00a0 \u00a0and \"Sepal.Length\" &gt; SLavg\n\u00a0 \u00a0 group by iris.Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4<\/span>\n<span class=\"typ\">Abbr<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Species<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> 
levels<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris$Species<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 <\/span><span class=\"typ\">Abbr<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"S\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Ve\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Vi\"<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"com\"># 4a. This works:<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select iris.Species, count(*) \n\u00a0 from iris natural join Abbr group by iris.Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># but this does not work (but does in sqlite) ###<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, count(*) \n\u00a0 from iris natural join Abbr group by Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4b. \u00a0H2 does not support using but does support on (but query is longer) ###<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, count(*) \n\u00a0 from iris join Abbr on iris.Species = Abbr.Species group by iris.Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4c.<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, avg(\"Sepal.Length\") from iris, Abbr\n\u00a0 \u00a0 \u00a0where iris.Species = Abbr.Species group by iris.Species'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4d. \u00a0# This still needs to be fixed. 
#<\/span>\n<span class=\"kwd\">out<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select s.Species, s.dt, t.Station_id, t.Value\n\u00a0 \u00a0 \u00a0 \u00a0 from species s, temp t \n\u00a0 \u00a0 \u00a0 \u00a0 where ABS(s.dt - t.dt) = \n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 (select min(abs(s2.dt - t2.dt)) \n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 from species s2, temp t2\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 where s.Species = s2.Species and t.Station_id = t2.Station_id)\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4e. H2 does not support using but we can use on (but query is longer) ###<\/span>\n<span class=\"com\"># Also the missing value in x seems to get filled with 0 rather than NA ###<\/span><span class=\"pln\">\nSNP1x <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Animal<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Marker<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"pun\">.<\/span><span 
class=\"typ\">Label<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"P1001\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1002\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1004\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1005\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1006\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1007\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"factor\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 x <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">2L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">1L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2L<\/span><span class=\"pun\">)),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Animal\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Marker\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"x\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"3213\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1295\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"915\"<\/span><span class=\"pun\">,<\/span> <span 
class=\"str\">\"2833\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1487\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nSNP4 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Animal<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Marker<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">6<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Label<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"P1001\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"str\">\"P1002\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1004\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1005\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1006\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1007\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> 
<span class=\"pun\">=<\/span> <span class=\"str\">\"factor\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 Y <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">)),<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Animal\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Marker\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Y\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"3213\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1295\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"915\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2833\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1487\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1885\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select SNP4.Animal, SNP4.Marker, Y, x \n\u00a0 \u00a0 \u00a0 \u00a0 from SNP4 left join SNP1x \n\u00a0 \u00a0 \u00a0 \u00a0 on 
SNP4.Animal = SNP1x.Animal and SNP4.Marker = SNP1x.Marker\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 4f. This still needs to be fixed. #<\/span><span class=\"pln\">\n\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tt <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">6<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"tt\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">-<\/span><span class=\"lit\">2L<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tt <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">4<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">5<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">7<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> d <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span 
class=\"pun\">(<\/span><span class=\"lit\">8.3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">10.3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">19<\/span><span class=\"pun\">,<\/span> \n<span class=\"lit\">16<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">15.6<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">19.8<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"tt\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"d\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">6L<\/span>\n<span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> reference <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"A1.4, p. 
270\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"kwd\">out<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF d, DF2 a, DF2 b \n\u00a0 \u00a0 \u00a0 \u00a0 where a.row_names = b.row_names - 1 and d.tt &gt; a.tt and d.tt &lt;= b.tt\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 5<\/span><span class=\"pln\">\nminSL <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">7<\/span><span class=\"pln\">\nlimit <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">3<\/span><span class=\"pln\">\nfn$sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris where \"Sepal.Length\" &gt; $minSL limit $limit'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 6a. 
Species get converted to upper case ###<\/span>\n\n<span class=\"com\"># \u00a0 \u00a0alternative 1<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># convert factor to numeric<\/span><span class=\"pln\">\nfac2num <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span> <span class=\"typ\">UseMethod<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"fac2num\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nfac2num<\/span><span class=\"pun\">.<\/span><span class=\"pln\">factor <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">character<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nfac2num<\/span><span class=\"pun\">.<\/span><span class=\"pln\">data<\/span><span 
class=\"pun\">.<\/span><span class=\"pln\">frame <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> replace<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> lapply<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> fac2num<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nfac2num<\/span><span class=\"pun\">.<\/span><span class=\"kwd\">default<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> identity\n\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from csvread('iris3.dat')\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \n\u00a0 \u00a0data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">fac2num<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">[-<\/span><span class=\"lit\">5<\/span><span class=\"pun\">]),<\/span><span class=\"pln\"> x<\/span><span class=\"pun\">[<\/span><span class=\"lit\">5<\/span><span class=\"pun\">]))<\/span>\n\n<span class=\"com\"># \u00a0 \u00a0alternative 2 (H2 seems to get confused regarding case of Species)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select \n\u00a0 \u00a0cast(\"Sepal.Length\" as real) \"Sepal.Length\",\n\u00a0 \u00a0cast(\"Sepal.Width\" as real) \"Sepal.Width\",\n\u00a0 \u00a0cast(\"Petal.Length\" as real) \"Petal.Length\",\n\u00a0 \u00a0cast(\"Petal.Width\" as real) \"Petal.Width\",\n\u00a0 \u00a0SPECIES 
from csvread(\\'iris3.dat\\')'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># \u00a0 \u00a0alternative 3. \u00a01st line sets up 0 row table, iris0, with correct classes &amp; 2nd line<\/span>\n<span class=\"com\"># \u00a0 \u00a0 \u00a0inserts the data from iris3.dat into it and then selects it back.<\/span><span class=\"pln\">\n\niris0 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> nrows <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">)[<\/span><span class=\"lit\">0L<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">]<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"insert into iris0 (select * from csvread('iris3.dat'))\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"str\">\"select * from iris0\"<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"com\"># 6b.<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from csvread('iris3.dat')\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> tempfile<\/span><span class=\"pun\">(),<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">fac2num<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">[-<\/span><span class=\"lit\">5<\/span><span 
class=\"pun\">]),<\/span><span class=\"pln\"> x<\/span><span class=\"pun\">[<\/span><span class=\"lit\">5<\/span><span class=\"pun\">]))<\/span>\n\n<span class=\"com\"># 6c. Same answer as in 6a works whether or not there are row names<\/span>\n\n<span class=\"com\"># 6d. NA<\/span>\n\n<span class=\"com\"># 6e. <\/span>\n\n<span class=\"com\"># 6f.<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"1 8.3\n210.3\n\n319.0\n416.0\n515.6\n719.8\n\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"fixed\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 \n\u00a0 \u00a0from csvread('fixed', 'V1') limit 3\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 6g. NA<\/span>\n\n<span class=\"com\"># 7a<\/span>\n\n<span class=\"com\"># this is sqlite (how do you work with rowid's in H2?) 
###<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris i \n\u00a0 \u00a0where rowid in \n\u00a0 \u00a0 (select rowid from iris where Species = i.Species order by \"Sepal.Length\" desc limit 2)\n\u00a0 \u00a0order by i.Species, i.\"Sepal.Length\" desc'<\/span><span class=\"pun\">)<\/span>\n\n\n<span class=\"com\"># 7b - same question ###<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">chron<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">101<\/span><span class=\"pun\">:<\/span><span class=\"lit\">200<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> tt <\/span><span class=\"pun\">=<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"2000-01-01\"<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">0<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> len <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">100<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">by<\/span> <span class=\"pun\">=<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> cbind<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> month<\/span><span class=\"pun\">.<\/span><span class=\"pln\">day<\/span><span class=\"pun\">.<\/span><span class=\"pln\">year<\/span><span class=\"pun\">(<\/span><span class=\"pln\">unclass<\/span><span class=\"pun\">(<\/span><span 
class=\"pln\">DF$tt<\/span><span class=\"pun\">)))<\/span><span class=\"pln\">\n\u00a0\n<\/span><span class=\"com\"># sqlite:<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF d\n\u00a0 \u00a0where rowid in \n\u00a0 \u00a0 (select rowid from DF \n\u00a0 \u00a0 \u00a0 \u00a0where year = d.year and month = d.month and day &gt;= 21 limit 1)\n\u00a0 \u00a0order by tt\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 7c.<\/span><span class=\"pln\">\na <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"st en\n1 4\n11 14\n3 4\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0\nb <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"st en\n2 5\n3 6\n30 44\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from a where \n\u00a0 \u00a0 \u00a0 \u00a0 (select count(*) from b where a.en &gt;= b.st and b.en &gt;= a.st) &gt; 0\"<\/span><span class=\"pun\">)<\/span>\n\n\n<span class=\"com\"># 8. In H2 one uses csvread rather than file and file.format. 
See:<\/span>\n<span class=\"com\"># http:\/\/www.h2database.com\/html\/functions.html#csvread<\/span><span class=\"pln\">\n\nnumStr <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">character<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">100<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">numStr<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Hello\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"tmp99.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from csvread('tmp99.csv') limit 5\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># Note that ~ does not work on Windows in H2: ###<\/span>\n<span class=\"com\"># sqldf(\"select * from csvread('~\/tmp.csv')\")<\/span>\n\n\n<span class=\"com\"># 9 - RH2 does not support. Only select statements currently. 
###<\/span>\n\n<span class=\"com\"># create new empty database called mydb<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"attach 'mydb' as new\"<\/span><span class=\"pun\">)<\/span> \n\n<span class=\"com\"># create a new table, mytab, in the new database<\/span>\n<span class=\"com\"># Note that sqldf does not delete tables created from create.<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create table mytab as select * from BOD\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># shows its still there<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from mytab\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 10 - RH2 does not support sqldf() ###<\/span><span class=\"pln\">\n\nsqldf<\/span><span class=\"pun\">()<\/span> \n<span class=\"com\"># uses connection just created<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from main.iris3 where \"Sepal.Width\" = 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">()<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 10b.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Here is another way to do example 10a. 
\u00a0We use the same iris3,<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># iris3.dat and sqldf development version as above. \u00a0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># We grab connection explicitly, set up the database using sqldf and then <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># for the second call we call dbGetQuery from RSQLite. \u00a0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># In that case we don't need to qualify iris3 as main.iris3 since<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># RSQLite would not understand R variables anyways so there is no <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># ambiguity.<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> con <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span> \n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># uses connection just created<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span 
class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span><span class=\"pln\"> dbGetQuery<\/span><span class=\"pun\">(<\/span><span class=\"pln\">con<\/span><span class=\"pun\">,<\/span> <span class=\"str\">'select * from iris3 where \"Sepal.Width\" = 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># close<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span>\n\n\n<span class=\"com\"># 11. 
Between - these work same as sqlite<\/span><span class=\"pln\">\n\nseqdf <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">thetime<\/span><span class=\"pun\">=<\/span><span class=\"pln\">seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">100<\/span><span class=\"pun\">,<\/span><span class=\"lit\">225<\/span><span class=\"pun\">,<\/span><span class=\"lit\">5<\/span><span class=\"pun\">),<\/span><span class=\"pln\">thevalue<\/span><span class=\"pun\">=<\/span><span class=\"pln\">factor<\/span><span class=\"pun\">(<\/span><span class=\"pln\">letters<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nboundsdf <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">thestart<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">110<\/span><span class=\"pun\">,<\/span><span class=\"lit\">160<\/span><span class=\"pun\">,<\/span><span class=\"lit\">200<\/span><span class=\"pun\">),<\/span><span class=\"pln\">theend<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">130<\/span><span class=\"pun\">,<\/span><span class=\"lit\">180<\/span><span class=\"pun\">,<\/span><span class=\"lit\">220<\/span><span class=\"pun\">),<\/span><span class=\"pln\">groupID<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">555<\/span><span class=\"pun\">,<\/span><span class=\"lit\">666<\/span><span class=\"pun\">,<\/span><span class=\"lit\">777<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"com\"># run the query using two inequalities<\/span><span class=\"pln\">\ntestquery_1 <\/span><span class=\"pun\">&lt;-<\/span><span 
class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select seqdf.thetime, seqdf.thevalue, boundsdf.groupID \nfrom seqdf left join boundsdf on (seqdf.thetime &lt;= boundsdf.theend) and (seqdf.thetime &gt;= boundsdf.thestart)\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># run the same query using 'between...and' clause<\/span><span class=\"pln\">\ntestquery_2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select seqdf.thetime, seqdf.thevalue, boundsdf.groupID \nfrom seqdf LEFT JOIN boundsdf ON (seqdf.thetime BETWEEN boundsdf.thestart AND boundsdf.theend)\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># 12 combine two files - not supported by RH2 ###<\/span>\n\n<span class=\"com\"># 13 see #8<\/span><\/pre>\n<h2><a name=\"11._Why_am_I_having_difficulty_reading_a_data_file_using_SQLite\"><\/a>11. Why am I having difficulty reading a data file using SQLite and sqldf?<\/h2>\n<p>SQLite is fussy about line endings. The <tt>eol<\/tt> argument to <tt>read.csv.sql<\/tt> can be used to specify the line endings if they differ from the normal line endings on your platform, e.g.<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"myfile.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> eol <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\n\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p><tt>eol<\/tt> can also be used as a component of the sqldf <tt>file.format<\/tt> argument.<\/p>\n<h2><a name=\"12._How_does_one_use_sqldf_with_PostgreSQL?\"><\/a>12. How does one use sqldf with PostgreSQL?<\/h2>\n<p>Install (1) PostgreSQL, (2) the RPostgreSQL R package and (3) sqldf itself. 
RPostgreSQL and sqldf are ordinary R package installs.<\/p>\n<p>Make sure that you have created an empty database, e.g. <tt>\"test\"<\/tt>. The createdb program that comes with PostgreSQL can be used for that. For example, from the console\/shell, create a database called test like this:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">createdb <\/span><span class=\"pun\">--<\/span><span class=\"pln\">help\ncreatedb <\/span><span class=\"pun\">--<\/span><span class=\"pln\">username<\/span><span class=\"pun\">=<\/span><span class=\"pln\">postgres test<\/span><\/pre>\n<p>Here is an example using RPostgreSQL and after that we show an example using RpgSQL. The <tt>options<\/tt> statement shown below can be entered directly or, alternatively, put in your <tt>.Rprofile<\/tt>. The values shown here are actually the defaults:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">options<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">.<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">.<\/span><span class=\"pln\">user <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"postgres\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 sqldf<\/span><span class=\"pun\">.<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">.<\/span><span class=\"pln\">password <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"postgres\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\n\u00a0 sqldf<\/span><span class=\"pun\">.<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">.<\/span><span class=\"pln\">dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"test\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\n\u00a0 sqldf<\/span><span class=\"pun\">.<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">.<\/span><span class=\"pln\">host <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"localhost\"<\/span><span 
class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 sqldf<\/span><span class=\"pun\">.<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">.<\/span><span class=\"pln\">port <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">5432<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"Group_A Group_B Group_C Value \nA1 B1 C1 10 \nA1 B1 C2 20 \nA1 B1 C3 30 \nA1 B2 C1 40 \nA1 B2 C2 10 \nA1 B2 C3 5 \nA1 B2 C4 30 \nA2 B1 C1 40 \nA2 B1 C2 5 \nA2 B1 C3 2 \nA2 B2 C1 26 \nA2 B2 C2 1 \nA2 B3 C1 23 \nA2 B3 C2 15 \nA2 B3 C3 12 \nA3 B3 C4 23 \nA3 B3 C5 23\"<\/span><span class=\"pln\">\n\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"kwd\">is<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># upper case is folded to lower case by default so surround DF with double quotes<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select count(*) from \"DF\" '<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select *, rank() over \u00a0(partition by \"Group_A\", \"Group_B\" order by \"Value\") \n\u00a0 
\u00a0 \u00a0 \u00a0from \"DF\" \n\u00a0 \u00a0 \u00a0 \u00a0order by \"Group_A\", \"Group_B\", \"Group_C\" '<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>For another example using <tt>over<\/tt> and <tt>partition by<\/tt> see: <a href=\"http:\/\/stackoverflow.com\/questions\/8559485\/r-cumulative-sum-by-group-in-sqldf\/8561324#8561324\" rel=\"nofollow\">this cumsum example<\/a><\/p>\n<p>Also note that <tt>log<\/tt> and <tt>log10<\/tt> in R correspond to <tt>ln<\/tt> and <tt>log<\/tt>, respectively, in PostgreSQL.<\/p>\n<h2><a name=\"13._How_does_one_deal_with_quoted_fields_in_read.csv.sql_?\"><\/a>13. How does one deal with quoted fields in <tt>read.csv.sql<\/tt>?<\/h2>\n<p><tt>read.csv.sql<\/tt> provides an interface to SQLite&#8217;s CSV reader. That reader is fast but not very flexible; in particular, it does not understand quoted fields and instead treats the quotes as part of the field itself. To read a file with <tt>read.csv.sql<\/tt> and strip all double quotes from it at the same time, try the following on Windows, assuming Rtools is installed and on your path (on UNIX, use the corresponding <tt>tr<\/tt> syntax for your shell):<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"myfile.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">'tr.exe -d ^\" '<\/span> <span class=\"pun\">)<\/span><\/pre>\n<p>or equivalently:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"myfile.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> 
list<\/span><span class=\"pun\">(<\/span><span class=\"str\">'gawk -f prog'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> prog <\/span><span class=\"pun\">=<\/span> <span class=\"str\">'{ gsub(\/\"\/, \"\"); print }'<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">)<\/span><\/pre>\n<p>Another program to look at is the <a href=\"http:\/\/code.google.com\/p\/csvfix\/\" rel=\"nofollow\">csvfix<\/a> program (this is a free external program, not an R program). For example, suppose we have commas in two contexts: (1) as separators between fields and (2) within double quoted fields. To handle that case we can use <tt>csvfix<\/tt> to translate the separators to semicolons, stripping off the double quotes at the same time (assuming we have installed <tt>csvfix<\/tt> and put it on our path):<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"myfile.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\";\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"csvfix write_dsv -s ;\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"14._How_does_one_read_files_where_numeric_NAs_are_represented_as\"><\/a>14. 
How does one read files where numeric NAs are represented as missing empty fields?<\/h2>\n<p>Translate the empty fields to some number that will represent NA and then fix it up on the R end.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># The problem is that SQLite's read routine regards empty<\/span>\n<span class=\"com\"># fields as zero length character strings rather than NA.<\/span>\n<span class=\"com\"># We handle that by replacing such strings with -999, say,<\/span>\n<span class=\"com\"># using gawk and the read.csv.sql filter argument and then<\/span>\n<span class=\"com\"># fixing it up in R later.<\/span>\n\n\n<span class=\"com\"># write out test data<\/span><span class=\"pln\">\n\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"a\\tb\\tc\naa\\t\\t23\naaa\\t34.6\\t\naaaa\\t\\t77.8\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"x.txt\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># create single line awk program to insert -999 as NA<\/span><span class=\"pln\">\n\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">'{ gsub(\"\\t\\t\", \"\\t-999\\t\"); gsub(\"\\t$\", \"\\t-999\"); print}'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"x.awk\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># on Windows gawk uses \\n as eol even though most<\/span>\n<span class=\"com\"># other programs use \\r\\n so we need to specify that.<\/span>\n<span class=\"com\"># eol= may or may not be needed here on other platforms.<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span 
class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"x.txt\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\t\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> eol <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\n\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"gawk -f x.awk\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># replace -999's with NA<\/span>\n\n<span class=\"kwd\">is<\/span><span class=\"pun\">.<\/span><span class=\"pln\">na<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">==<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">999<\/span><\/pre>\n<p>Another program that can be used in filters is the free csvfix. For example, suppose that csvfix is on our path and that NA values are represented as NA in numeric fields. 
We would like to convert them to -999 and then later remove them.<\/p>\n<pre class=\"prettyprint\"><span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"a,b\n3,NA\n4,65\"<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"myfile.csv\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nfilter <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"str\">'csvfix map -fv ,NA -tv ,-999 myfile.csv | csvfix write_dsv -s ,'<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">filter <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> filter<\/span><span class=\"pun\">)<\/span>\n<span class=\"kwd\">is<\/span><span class=\"pun\">.<\/span><span class=\"pln\">na<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">==<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">999<\/span><\/pre>\n<p>Another way in which the input file can be malformed is that not every line has the same number of fields. 
In that case <tt>csvfix pad -n<\/tt> can be used to pad it out as in this example:<\/p>\n<pre class=\"prettyprint\"><span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"a,b,c\na,b,\na,b\nq,r,t\"<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"c.csv\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"csvfix pad -n 3 c.csv | csvfix write_dsv -s ,\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"15._Why_do_certain_calculations_come_out_as_integer_rather_than\"><\/a>15. 
Why do certain calculations come out as integer rather than double?<\/h2>\n<p>SQLite\/RSQLite, h2\/RH2, PostgreSQL all perform integer division on integers; however, RMySQL\/MySQL performs real division.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">2<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">:<\/span><span class=\"lit\">1<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> str<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># columns are integer<\/span>\n<span class=\"str\">'data.frame'<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">2<\/span><span class=\"pln\"> obs<\/span><span class=\"pun\">.<\/span><span class=\"pln\"> of \u00a0<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> variables<\/span><span class=\"pun\">:<\/span><span class=\"pln\">\n\u00a0$ a<\/span><span class=\"pun\">:<\/span> <span class=\"kwd\">int<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">2<\/span><span class=\"pln\">\n\u00a0$ b<\/span><span class=\"pun\">:<\/span> <span class=\"kwd\">int<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span> <span class=\"lit\">1<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># using sqlite - integer division<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span 
class=\"str\">\"select a\/b as quotient from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># force real division<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select (a+0.0)\/b as quotient from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0.5<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2.0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># force real division<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select cast(a as real)\/b as quotient from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0.5<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2.0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># insert into table with real columns<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create table mytab(a real, b real)\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"str\">\"insert into mytab select * from DF\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> 
\u00a0\n<\/span><span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"str\">\"select a\/b as quotient from mytab\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0.5<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2.0<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># convert all columns to numeric using method= argument<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Requires sqldf 0.4-0 or later<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> tonum <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> replace<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> lapply<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select a\/b as quotient from DF\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> method <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"auto\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> tonum<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0.5<\/span>\n<span 
class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2.0<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># use RMySQL - uses real division<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Requires sqldf 0.4-0 or later<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RMySQL<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select a\/b as quotient from DF\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 quotient\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">0.5<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2.0<\/span><\/pre>\n<h2><a name=\"16._How_can_one_read_a_file_off_the_net_or_a_csv_file_in_a_zip_f\"><\/a>16. 
How can one read a file off the net or a csv file in a zip file?<\/h2>\n<p>Use <tt>read.csv.sql<\/tt> and specify the URL of the file:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># 1<\/span><span class=\"pln\">\nURL <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"http:\/\/www.wnba.com\/liberty\/media\/NYL2011ScheduleV3.csv\"<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">URL<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> eol <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\r\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>Since files off the net could have any end of line be careful to specify it properly for the file of interest.<\/p>\n<p>As an alternative one could use the filter argument. To use this <tt>wget<\/tt> (<a href=\"http:\/\/wget.addictivecode.org\/FrequentlyAskedQuestions?action=show&amp;redirect=Faq#download\" rel=\"nofollow\">download<\/a>, <a href=\"http:\/\/gnuwin32.sourceforge.net\/packages\/wget.htm\" rel=\"nofollow\">Windows<\/a>) must be present on the system command path.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># 2 - same URL as above<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">eol <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\r\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> paste<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"wget -O - \"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> URL<\/span><span 
class=\"pun\">))<\/span><\/pre>\n<p>Here is an example of reading a zip file which contains a single file that is a <tt>csv<\/tt>:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"7z x -so anscombe.zip 2&gt;NUL\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>In the line of code above it is assumed that <tt>7z<\/tt> (<a href=\"http:\/\/www.7-zip.org\/download.html\" rel=\"nofollow\">download<\/a>) is present and on the system command path. The example is for Windows. On UNIX use <tt>\/dev\/null<\/tt> in place of <tt>NUL<\/tt>.<\/p>\n<p>If we had a <tt>.tar.gz<\/tt> file it could be done like this:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"pln\">filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"tar xOfz anscombe.tar.gz\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>assuming that tar is available on our path. (Normally tar is available on Linux, and on Windows it is available as part of the <a href=\"http:\/\/cran.r-project.org\/bin\/windows\/Rtools\/\" rel=\"nofollow\">Rtools<\/a> distribution on CRAN.)<\/p>\n<p>Note that <tt>filter<\/tt> causes the filtered output to be stored in a temporary file and then read into sqlite. 
It does not actually read the data directly from the net into sqlite or directly from the zip or tar.gz file to sqlite.<\/p>\n<p><i>Note:<\/i> The examples in this section assume sqldf 0.4-4 or later.<\/p>\n<h1><a name=\"Examples\"><\/a>Examples<\/h1>\n<p>These examples illustrate usage of both sqldf and SQLite. For sqldf with H2 see <a href=\"http:\/\/code.google.com\/p\/sqldf\/#10.__What_are_some_of_the_differences_between_using_SQLite_and_H\" rel=\"nofollow\">FAQ #10<\/a>. For PostgreSQL see <a href=\"http:\/\/code.google.com\/p\/sqldf\/#12._How_does_one_use_sqldf_with_PostgreSQL?\" rel=\"nofollow\">FAQ#12<\/a>. Also the <tt>\"sqldf-unitTests\"<\/tt> demo that comes with sqldf works under sqldf with SQLite, H2, PostgreSQL and MySQL. David L. Reiner has created some further examples <a href=\"http:\/\/files.meetup.com\/1625815\/crug_sqldf_05-01-2013.pdf\" rel=\"nofollow\">here<\/a> and Paul Shannon has examples <a href=\"http:\/\/brusers.tumblr.com\/post\/59706993506\/data-manipulation-with-sqldf-paul\" rel=\"nofollow\">here<\/a>.<\/p>\n<h2><a name=\"Example_1._Ordering_and_Limiting\"><\/a>Example 1. Ordering and Limiting<\/h2>\n<p>Here is an example of sorting and limiting output from an SQL select statement on the iris data frame that comes with R. Note that although the iris dataset uses the name <tt>Sepal.Length<\/tt> older versions of the RSQLite driver convert that to <tt>Sepal_Length<\/tt>; however, newer versions do not. 
After installing sqldf in R, just type the first two lines into the R console (without the &gt;):<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris order by \"Sepal.Length\" desc limit 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.0<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.2<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.7<\/span><span 
class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.3<\/span><span class=\"pln\"> virginica<\/span><\/pre>\n<h2><a name=\"Example_2._Averaging_and_Grouping\"><\/a>Example 2. Averaging and Grouping<\/h2>\n<p>Here is an example which processes an SQL select statement whose functionality is similar to the R aggregate function.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Species, avg(\"Sepal.Length\") from iris group by Species')\n\n\u00a0 \u00a0 \u00a0Species avg(Sepal.Length)\n1 \u00a0 \u00a0 setosa \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 5.006\n2 versicolor \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 5.936\n3 \u00a0virginica \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 6.588<\/span><\/pre>\n<h2><a name=\"Example_3._Nested_Select\"><\/a>Example 3. Nested Select<\/h2>\n<p>Here is a more complex example. For each Species, find the average Sepal Length among those rows where Sepal Length exceeds the average Sepal Length for that Species. 
Note the use of a subquery and explicit column naming:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select iris.Species '[Species]', \n+ \u00a0 \u00a0 \u00a0 avg(\\\"Sepal.Length\\\") '[Avg of SLs &gt; avg SL]'\n+ \u00a0 \u00a0from iris, \n+ \u00a0 \u00a0 \u00a0 \u00a0 (select Species, avg(\\\"Sepal.Length\\\") SLavg \n+ \u00a0 \u00a0 \u00a0 \u00a0 from iris group by Species) SLavg\n+ \u00a0 \u00a0where iris.Species = SLavg.Species\n+ \u00a0 \u00a0 \u00a0 and \\\"Sepal.Length\\\" &gt; SLavg\n+ \u00a0 \u00a0group by iris.Species\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 \u00a0<\/span><span class=\"pun\">[<\/span><span class=\"typ\">Species<\/span><span class=\"pun\">]<\/span> <span class=\"pun\">[<\/span><span class=\"typ\">Avg<\/span><span class=\"pln\"> of <\/span><span class=\"typ\">SLs<\/span> <span class=\"pun\">&gt;<\/span><span class=\"pln\"> avg SL<\/span><span class=\"pun\">]<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.313636<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> versicolor \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.375000<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0virginica \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.159091<\/span>\n\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># same - using only core R - based on discussion with Dennis Toddenroth<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> aggregate<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"pun\">~<\/span> <span class=\"typ\">Species<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> iris<\/span><span 
class=\"pun\">,<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> mean<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">[<\/span><span class=\"pln\">x <\/span><span class=\"pun\">&gt;<\/span><span class=\"pln\"> mean<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x<\/span><span class=\"pun\">)]))<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa \u00a0 \u00a0 <\/span><span class=\"lit\">5.313636<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> versicolor \u00a0 \u00a0 <\/span><span class=\"lit\">6.375000<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0virginica \u00a0 \u00a0 <\/span><span class=\"lit\">7.159091<\/span><\/pre>\n<p>Note that, at the time of writing, PostgreSQL is the only free database supported by sqldf that offers <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/tutorial-window.html\" rel=\"nofollow\">window<\/a> <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/functions-window.html\" rel=\"nofollow\">functions<\/a> (similar to the <tt>ave<\/tt> function in R), which would allow a different formulation of the above. 
For more on using sqldf with PostgreSQL see <a href=\"http:\/\/code.google.com\/p\/sqldf\/#12._How_does_one_use_sqldf_with_PostgreSQL?\" rel=\"nofollow\">FAQ #12<\/a><\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> tmp <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select \n+ \u00a0 \u00a0 \u00a0 \"Species\", \n+ \u00a0 \u00a0 \u00a0 \"Sepal.Length\", \n+ \u00a0 \u00a0 \u00a0 \"Sepal.Length\" - avg(\"Sepal.Length\") over (partition by \"Species\") \"above.mean\" \n+ \u00a0 \u00a0 from iris'<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select \"Species\", avg(\"Sepal.Length\") \n+ \u00a0 \u00a0 \u00a0 \u00a0from tmp \n+ \u00a0 \u00a0 \u00a0 \u00a0where \"above.mean\" &gt; 0 \n+ \u00a0 \u00a0 \u00a0 \u00a0group by \"Species\"'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0avg\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa <\/span><span class=\"lit\">5.313636<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0virginica <\/span><span class=\"lit\">7.159091<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> versicolor <\/span><span class=\"lit\">6.375000<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># or, alternately, we could perform the above two steps in a single statement:<\/span>\n<span 
class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'\n+ \u00a0select \"Species\", avg(\"Sepal.Length\") \n+ \u00a0from \n+ \u00a0 \u00a0 (select \"Species\", \n+ \u00a0 \u00a0 \u00a0 \u00a0 \"Sepal.Length\", \n+ \u00a0 \u00a0 \u00a0 \u00a0 \"Sepal.Length\" - avg(\"Sepal.Length\") over (partition by \"Species\") \"above.mean\" \n+ \u00a0 \u00a0 from iris) a \n+ \u00a0where \"above.mean\" &gt; 0 \n+ \u00a0group by \"Species\"'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0avg\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa <\/span><span class=\"lit\">5.313636<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> versicolor <\/span><span class=\"lit\">6.375000<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0virginica <\/span><span class=\"lit\">7.159091<\/span><\/pre>\n<p>which in R corresponds to this R code (i.e. 
<tt>over (partition by ...)<\/tt> in PostgreSQL corresponds to <tt>ave<\/tt> in R):<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> tmp <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> transform<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> above<\/span><span class=\"pun\">.<\/span><span class=\"pln\">mean <\/span><span class=\"pun\">=<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"pun\">-<\/span><span class=\"pln\"> ave<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span><span class=\"pun\">,<\/span> <span class=\"typ\">Species<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> FUN <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> mean<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> aggregate<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"pun\">~<\/span> <span class=\"typ\">Species<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> subset<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tmp<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> above<\/span><span class=\"pun\">.<\/span><span class=\"pln\">mean <\/span><span class=\"pun\">&gt;<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> mean<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa \u00a0 \u00a0 <\/span><span class=\"lit\">5.313636<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> versicolor \u00a0 \u00a0 <\/span><span class=\"lit\">6.375000<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0virginica 
\u00a0 \u00a0 <\/span><span class=\"lit\">7.159091<\/span><\/pre>\n<p>Here is some sample data with the correlated subquery from this <a href=\"http:\/\/en.wikipedia.org\/wiki\/Correlated_subquery\" rel=\"nofollow\">Wikipedia page<\/a>:<\/p>\n<pre class=\"prettyprint\"><span class=\"typ\">Emp<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">emp <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> letters<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">24<\/span><span class=\"pun\">],<\/span><span class=\"pln\"> salary <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">24<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dept <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> rep<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"A\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"B\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"C\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> each <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">8<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"SELECT *\n\u00a0FROM Emp AS e1\n\u00a0WHERE salary &gt; (SELECT avg(salary)\n\u00a0 \u00a0 FROM Emp\n\u00a0 \u00a0 WHERE dept = e1.dept)\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"Example_4._Join\"><\/a>Example 4. Join<\/h2>\n<p>The different types of joins are pictured in this image: i.imgur.com\/1m55Wqo.jpg. (SQLite does not support right joins but the other databases sqldf supports do.) 
We define a new data frame, <tt>Abbr<\/tt>, join it with <tt>iris<\/tt> and perform the aggregation:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 4a.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"typ\">Abbr<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Species<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> levels<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris$Species<\/span><span class=\"pun\">),<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"typ\">Abbr<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"S\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Ve\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Vi\"<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, avg(\"Sepal.Length\") \n+ \u00a0 from iris natural join Abbr group by Species'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Abbr<\/span><span class=\"pln\"> avg<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0S \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.006<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Ve<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.936<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 <\/span><span 
class=\"typ\">Vi<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">6.588<\/span><\/pre>\n<p>Although the above is probably the shortest way to write it in SQL, using <tt>natural join<\/tt> can be a bit dangerous since one must be very sure one knows precisely which column names are common to both tables. For example, had we included the <tt>row_names<\/tt> as a column in both tables (by specifying <tt>row.names = TRUE<\/tt> to sqldf) the natural join would not work as intended since the <tt>row_names<\/tt> columns would participate in the join. An alternate and safer way to write this would be with <tt>join<\/tt> and <tt>using<\/tt>:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 4b.<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, avg(\"Sepal.Length\") \n+ \u00a0 from iris join Abbr using(Species) group by Species'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Abbr<\/span><span class=\"pln\"> avg<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0S \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.006<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Ve<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.936<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Vi<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">6.588<\/span><\/pre>\n<p>or with a <tt>where<\/tt> clause:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># 
Example 4c.<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select Abbr, avg(\"Sepal.Length\") from iris, Abbr\n+ \u00a0 \u00a0where iris.Species = Abbr.Species group by iris.Species'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Abbr<\/span><span class=\"pln\"> avg<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0S \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.006<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Ve<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">5.936<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Vi<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">6.588<\/span><\/pre>\n<p>or a temporal join where the goal is, for each Species\/station_id pair, to join the records with the closest date\/times.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 4d. 
Temporal Join<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># see: https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-March\/191938.html<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">chron<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"typ\">Species<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"Species,Date_Sampled\n+ SpeciesB,2008-06-23 13:55:11\n+ SpeciesA,2008-06-23 13:43:11\n+ SpeciesC,2008-06-23 13:55:11\"<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> species <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Species<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"kwd\">is<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> species$dt <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">chron<\/span><span class=\"pun\">(<\/span><span class=\"pln\">species$Date<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"typ\">Temp<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span 
class=\"str\">\"Station_id,Date,Value\n+ ANH,2008-06-23 13:00:00,1.96\n+ ANH,2008-06-23 14:00:00,2.25\n+ BDT,2008-06-23 13:00:00,4.23\n+ BDT,2008-06-23 13:15:00,4.11\n+ BDT,2008-06-23 13:30:00,4.01\n+ BDT,2008-06-23 13:45:00,3.9\n+ BDT,2008-06-23 14:00:00,3.82\"<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> temp <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Temp<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"kwd\">is<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> temp$dt <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">numeric<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">chron<\/span><span class=\"pun\">(<\/span><span class=\"pln\">temp$Date<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select s.Species, s.dt, t.Station_id, t.Value\n+ from species s, temp t \n+ where abs(s.dt - t.dt) = \n+ (select min(abs(s2.dt - t2.dt)) \n+ from species s2, temp t2\n+ where s.Species = s2.Species and t.Station_id = t2.Station_id)\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">$dt <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> chron<\/span><span 
class=\"pun\">(<\/span><span class=\"kwd\">out<\/span><span class=\"pln\">$dt<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">\n\u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0dt <\/span><span class=\"typ\">Station_id<\/span> <span class=\"typ\">Value<\/span>\n<span class=\"lit\">1<\/span> <span class=\"typ\">SpeciesB<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">55<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0ANH \u00a0 \u00a0 <\/span><span class=\"lit\">2.25<\/span>\n<span class=\"lit\">2<\/span> <span class=\"typ\">SpeciesB<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">55<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0BDT \u00a0 \u00a0 <\/span><span class=\"lit\">3.82<\/span>\n<span class=\"lit\">3<\/span> <span class=\"typ\">SpeciesA<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">43<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0ANH \u00a0 \u00a0 <\/span><span 
class=\"lit\">2.25<\/span>\n<span class=\"lit\">4<\/span> <span class=\"typ\">SpeciesA<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">43<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0BDT \u00a0 \u00a0 <\/span><span class=\"lit\">3.90<\/span>\n<span class=\"lit\">5<\/span> <span class=\"typ\">SpeciesC<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">55<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0ANH \u00a0 \u00a0 <\/span><span class=\"lit\">2.25<\/span>\n<span class=\"lit\">6<\/span> <span class=\"typ\">SpeciesC<\/span> <span class=\"pun\">(<\/span><span class=\"lit\">06<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">23<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">13<\/span><span class=\"pun\">:<\/span><span class=\"lit\">55<\/span><span class=\"pun\">:<\/span><span class=\"lit\">11<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0BDT \u00a0 \u00a0 <\/span><span class=\"lit\">3.82<\/span><\/pre>\n<p>A similar but slightly simpler example can be found <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-sig-finance\/2010q2\/006077.html\" rel=\"nofollow\">here<\/a>.<\/p>\n<p>Here is an example of a left join:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 4e. 
Left Join<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-April\/195882.html<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> SNP1x <\/span><span class=\"pun\">&lt;-<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Animal<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">),<\/span> <span class=\"typ\">Marker<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Label<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"P1001\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"str\">\"P1002\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1004\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1005\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1006\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1007\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"factor\"<\/span><span class=\"pun\">),<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 
\u00a0 x <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">2L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">1L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2L<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Animal\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Marker\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"str\">\"x\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"3213\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1295\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"915\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2833\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1487\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> SNP4 <\/span><span class=\"pun\">&lt;-<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Animal<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span 
class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">194073197L<\/span><span class=\"pun\">),<\/span> <span class=\"typ\">Marker<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">6<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Label<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"P1001\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"str\">\"P1002\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1004\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1005\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1006\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"P1007\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"factor\"<\/span><span class=\"pun\">),<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 \u00a0 Y <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.021088<\/span>\n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span 
class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Animal\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Marker\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Y\"<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"3213\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"str\">\"1295\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"915\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"2833\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1487\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"1885\"<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> SNP1x\n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Animal<\/span> <span class=\"typ\">Marker<\/span><span class=\"pln\"> x\n<\/span><span class=\"lit\">3213<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1001 <\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">1295<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1002 <\/span><span class=\"lit\">1<\/span>\n<span class=\"lit\">915<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1004 <\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">2833<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1005 <\/span><span class=\"lit\">0<\/span>\n<span class=\"lit\">1487<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1006 <\/span><span class=\"lit\">2<\/span>\n<span 
class=\"pun\">&gt;<\/span><span class=\"pln\"> SNP4\n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Animal<\/span> <span class=\"typ\">Marker<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0Y\n<\/span><span class=\"lit\">3213<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1001 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"lit\">1295<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1002 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"lit\">915<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1004 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"lit\">2833<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1005 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"lit\">1487<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1006 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"lit\">1885<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1007 <\/span><span class=\"lit\">0.021088<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from SNP4 left join SNP1x using (Animal, Marker)\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Animal<\/span> <span class=\"typ\">Marker<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0Y \u00a0x\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1001 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">2<\/span> <span 
class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1002 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">1<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1004 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1005 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">0<\/span>\n<span class=\"lit\">5<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1006 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">6<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1007 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> NA\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># or if that takes up too much memory <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># create\/use\/destroy external database<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from SNP4 left join SNP1x using (Animal, Marker)\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"test.db\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0<\/span><span class=\"typ\">Animal<\/span> <span class=\"typ\">Marker<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0Y \u00a0x\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1001 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">2<\/span> <span 
class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1002 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">1<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1004 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1005 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">0<\/span>\n<span class=\"lit\">5<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1006 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span>\n<span class=\"lit\">6<\/span> <span class=\"lit\">194073197<\/span><span class=\"pln\"> \u00a0P1007 <\/span><span class=\"lit\">0.021088<\/span><span class=\"pln\"> NA<\/span><\/pre>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 4f. 
\u00a0Another temporal join.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># join DF2 to row in DF for which DF.tt and DF2.tt are closest<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tt <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">6<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"tt\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">2L<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF\n\u00a0 tt\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">3<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">6<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tt <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span 
class=\"lit\">1<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">4<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">5<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">7<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> d <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">8.3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">10.3<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">19<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span> <span class=\"lit\">16<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">15.6<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">19.8<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"tt\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"d\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">6L<\/span>\n<span class=\"pun\">+<\/span> <span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> reference <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"A1.4, p. 
270\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2\n\u00a0 tt \u00a0 \u00a0d\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">8.3<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2<\/span> <span class=\"lit\">10.3<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">3<\/span> <span class=\"lit\">19.0<\/span>\n<span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">4<\/span> <span class=\"lit\">16.0<\/span>\n<span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">5<\/span> <span class=\"lit\">15.6<\/span>\n<span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">7<\/span> <span class=\"lit\">19.8<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF d, DF2 a, DF2 b \n+ where a.row_names = b.row_names - 1 \n+ and d.tt &gt; a.tt and d.tt &lt;= b.tt\"<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> \u00a0\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">$dd <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">with<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">out<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> ifelse<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tt <\/span><span class=\"pun\">&lt;<\/span> <span 
class=\"pun\">(<\/span><span class=\"pln\">tt<\/span><span class=\"pun\">.<\/span><span class=\"lit\">1<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> tt<\/span><span class=\"pun\">.<\/span><span class=\"lit\">2<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">\/<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> d<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> d<\/span><span class=\"pun\">.<\/span><span class=\"lit\">1<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">out<\/span><span class=\"pln\">\n\u00a0 tt tt<\/span><span class=\"pun\">.<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0d tt<\/span><span class=\"pun\">.<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0d<\/span><span class=\"pun\">.<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 dd\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2<\/span> <span class=\"lit\">10.3<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">3<\/span> <span class=\"lit\">19.0<\/span> <span class=\"lit\">19.0<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">5<\/span> <span class=\"lit\">15.6<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">7<\/span> <span class=\"lit\">19.8<\/span> <span class=\"lit\">19.8<\/span><\/pre>\n<p>Example 4g. Self Join. 
There is an example of a self-join here: <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/232314.html\" rel=\"nofollow\">problem<\/a> and answer here:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> structure<\/span><span class=\"pun\">(<\/span><span class=\"pln\">list<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Actor<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Jim\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Bob\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Bob\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Larry\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Alice\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Tom\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Tom\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Tom\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Alice\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Nancy\"<\/span><span class=\"pun\">),<\/span> <span class=\"typ\">Act<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"A\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"A\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"C\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"str\">\"D\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"C\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"F\"<\/span><span class=\"pun\">,<\/span> <span 
class=\"str\">\"D\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"A\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"B\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"B\"<\/span><span class=\"pun\">)),<\/span> <span class=\"pun\">.<\/span><span class=\"typ\">Names<\/span> <span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"Actor\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Act\"<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"pun\">),<\/span> <span class=\"kwd\">class<\/span> <span class=\"pun\">=<\/span> <span class=\"str\">\"data.frame\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">NA<\/span><span class=\"pun\">,<\/span> <span class=\"pun\">-<\/span><span class=\"lit\">10L<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> subset<\/span><span class=\"pun\">(<\/span><span class=\"pln\">unique<\/span><span class=\"pun\">(<\/span><span class=\"pln\">merge<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> DF<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">by<\/span> <span class=\"pun\">=<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">)),<\/span> <span class=\"typ\">Actor<\/span><span class=\"pun\">.<\/span><span class=\"pln\">x <\/span><span class=\"pun\">&lt;<\/span> <span class=\"typ\">Actor<\/span><span class=\"pun\">.<\/span><span class=\"pln\">y<\/span><span 
class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0<\/span><span class=\"typ\">Act<\/span> <span class=\"typ\">Actor<\/span><span class=\"pun\">.<\/span><span class=\"pln\">x <\/span><span class=\"typ\">Actor<\/span><span class=\"pun\">.<\/span><span class=\"pln\">y\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0A \u00a0 \u00a0 <\/span><span class=\"typ\">Jim<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"typ\">Tom<\/span>\n<span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0A \u00a0 \u00a0 <\/span><span class=\"typ\">Bob<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"typ\">Jim<\/span>\n<span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 \u00a0A \u00a0 \u00a0 <\/span><span class=\"typ\">Bob<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"typ\">Tom<\/span>\n<span class=\"lit\">11<\/span><span class=\"pln\"> \u00a0 B \u00a0 <\/span><span class=\"typ\">Alice<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Nancy<\/span>\n<span class=\"lit\">16<\/span><span class=\"pln\"> \u00a0 C \u00a0 <\/span><span class=\"typ\">Alice<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"typ\">Bob<\/span>\n<span class=\"lit\">20<\/span><span class=\"pln\"> \u00a0 D \u00a0 <\/span><span class=\"typ\">Larry<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"typ\">Tom<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select A.Act, A.Actor, B.Actor\n+ \u00a0 from DF A join DF B\n+ \u00a0 \u00a0 where A.Act = B.Act and A.Actor &lt; B.Actor\n+ \u00a0 \u00a0 \u00a0 order by A.Act, A.Actor\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Act<\/span> <span class=\"typ\">Actor<\/span> <span class=\"typ\">Actor<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 A \u00a0 <\/span><span class=\"typ\">Bob<\/span><span 
class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Jim<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 A \u00a0 <\/span><span class=\"typ\">Bob<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Tom<\/span>\n<span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 A \u00a0 <\/span><span class=\"typ\">Jim<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Tom<\/span>\n<span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 B <\/span><span class=\"typ\">Alice<\/span> <span class=\"typ\">Nancy<\/span>\n<span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0 C <\/span><span class=\"typ\">Alice<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Bob<\/span>\n<span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 D <\/span><span class=\"typ\">Larry<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Tom<\/span><\/pre>\n<p>Thanks to Raj Morejoys for the correction.<\/p>\n<p>Here is <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2011-February\/269680.html\" rel=\"nofollow\">another example of a self join<\/a>, which creates pairs and is followed by a second self join to produce pairs of pairs. This <a href=\"http:\/\/stackoverflow.com\/questions\/11448133\/double-merge-two-data-frames-in-r\" rel=\"nofollow\">Stack Overflow example<\/a> illustrates an sqldf triple join in which one table participates twice.<\/p>\n<p>Example 4h. Join nearby times. There is an example of joining records whose times are close but not necessarily identical here: <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/232588.html\" rel=\"nofollow\">problem<\/a> and <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/attachments\/20100320\/4ccb548f\/attachment.pl\" rel=\"nofollow\">answer<\/a>. 
Also taking successive differences involves joining adjacent times and this is illustrated <a href=\"http:\/\/stackoverflow.com\/questions\/6695673\/find-standard-deviation-of-first-differences-of-series-defined-with-group-by-usin\" rel=\"nofollow\">here<\/a> .<\/p>\n<p>Here is an example where we align time series Sy to series Sx by averaging all points of Sy within w = 0.25 units of each Sx time point. Tx and X are the times and values of Sx and Ty and Y are the times and values of Sy.<\/p>\n<pre class=\"prettyprint\"><span class=\"typ\">Tx<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> N<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.5<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Tx<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"typ\">Tx<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> rnorm<\/span><span class=\"pun\">(<\/span><span class=\"pln\">length<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Tx<\/span><span class=\"pun\">),<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.1<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nX <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sin<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Tx<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">10.0<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0sin<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Tx<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">5.0<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> rnorm<\/span><span class=\"pun\">(<\/span><span class=\"pln\">length<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Tx<\/span><span class=\"pun\">),<\/span> <span class=\"lit\">0<\/span><span 
class=\"pun\">,<\/span> <span class=\"lit\">0.1<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Ty<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> N<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.3333<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Ty<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"typ\">Ty<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> rnorm<\/span><span class=\"pun\">(<\/span><span class=\"pln\">length<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Ty<\/span><span class=\"pun\">),<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.02<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nY <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sin<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Ty<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">10.0<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> sin<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Ty<\/span><span class=\"pun\">\/<\/span><span class=\"lit\">5.0<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> rnorm<\/span><span class=\"pun\">(<\/span><span class=\"pln\">length<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Ty<\/span><span class=\"pun\">),<\/span> <span class=\"lit\">0<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">0.1<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nw <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">0.25<\/span><span class=\"pln\">\n\nsystem<\/span><span class=\"pun\">.<\/span><span class=\"pln\">time<\/span><span class=\"pun\">(<\/span><span class=\"pln\">out1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sapply<\/span><span class=\"pun\">(<\/span><span 
class=\"typ\">Tx<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">function<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tx<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> mean<\/span><span class=\"pun\">(<\/span><span class=\"pln\">Y<\/span><span class=\"pun\">[<\/span><span class=\"typ\">Ty<\/span> <span class=\"pun\">&gt;=<\/span><span class=\"pln\"> tx<\/span><span class=\"pun\">-<\/span><span class=\"pln\">w <\/span><span class=\"pun\">&amp;<\/span> <span class=\"typ\">Ty<\/span> <span class=\"pun\">&lt;=<\/span><span class=\"pln\"> tx<\/span><span class=\"pun\">+<\/span><span class=\"pln\">w<\/span><span class=\"pun\">])))<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Sx<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Tx<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> X<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Sy<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Ty<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> Y<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nsystem<\/span><span class=\"pun\">.<\/span><span class=\"pln\">time<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">out<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sqldf <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create index idx on Sx(Tx)\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"str\">\"select Tx, avg(Y) from main.Sx, Sy\n\u00a0 where Ty + 0.25 
&gt;= Tx and Ty - 0.25 &lt;= Tx group by Tx\"<\/span><span class=\"pun\">)))<\/span><span class=\"pln\">\n\nall<\/span><span class=\"pun\">.<\/span><span class=\"pln\">equal<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">out<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">[,<\/span><span class=\"lit\">2<\/span><span class=\"pun\">],<\/span><span class=\"pln\"> out1<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># TRUE<\/span><\/pre>\n<p>Example 4i. Speeding up joins with indexes. There are examples of speeding up a join by indexing a single join column <a href=\"http:\/\/statcompute.wordpress.com\/2013\/06\/09\/improve-the-efficiency-in-joining-data-with-index\/\" rel=\"nofollow\">here<\/a> and <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/232688.html\" rel=\"nofollow\">here<\/a>, and an example that indexes two join columns is shown below. Note that the <tt>create index<\/tt> statements in each example also have the effect of reading the data frames into the <tt>main<\/tt> database of SQLite. The <tt>select<\/tt> statement refers to <tt>main.DF1<\/tt> rather than just <tt>DF1<\/tt> so that it accesses the copy of <tt>DF1<\/tt> in <tt>main<\/tt>, which we just indexed, rather than the unindexed <tt>DF1<\/tt> in R. Similar comments apply to <tt>DF2<\/tt>. 
The statement <tt>sqldf(\"select * from sqlite_master\")<\/tt> will list the names and related info for all tables in <tt>main<\/tt>.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"kwd\">set<\/span><span class=\"pun\">.<\/span><span class=\"pln\">seed<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> n <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">1000000<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"lit\">4<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> c1 <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> runif<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span 
class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"lit\">4<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> c2 <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> runif<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> DBI\n<\/span><span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span> <span class=\"typ\">RSQLite<\/span>\n<span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> gsubfn\n<\/span><span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> proto\n<\/span><span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> 
chron\n<\/span><span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span>\n<span class=\"pun\">&lt;<\/span><span class=\"typ\">SQLiteConnection<\/span><span class=\"pun\">:(<\/span><span class=\"lit\">6480<\/span><span class=\"pun\">,<\/span><span class=\"lit\">0<\/span><span class=\"pun\">)&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> system<\/span><span class=\"pun\">.<\/span><span class=\"pln\">time<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create index ai1 on DF1(a, b)\"<\/span><span class=\"pun\">))<\/span>\n<span class=\"typ\">Loading<\/span><span class=\"pln\"> required <\/span><span class=\"kwd\">package<\/span><span class=\"pun\">:<\/span><span class=\"pln\"> tcltk\n<\/span><span class=\"typ\">Loading<\/span> <span class=\"typ\">Tcl<\/span><span class=\"pun\">\/<\/span><span class=\"typ\">Tk<\/span> <span class=\"kwd\">interface<\/span> <span class=\"pun\">...<\/span> <span class=\"kwd\">done<\/span><span class=\"pln\">\n\u00a0 \u00a0user \u00a0system elapsed \n\u00a0 <\/span><span class=\"lit\">16.69<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">0.19<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">19.12<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> system<\/span><span class=\"pun\">.<\/span><span class=\"pln\">time<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create index ai2 on DF2(a, b)\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 \u00a0user \u00a0system elapsed \n\u00a0 <\/span><span class=\"lit\">16.60<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">0.03<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">17.48<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> system<\/span><span 
class=\"pun\">.<\/span><span class=\"pln\">time<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from main.DF1 natural join main.DF2\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 \u00a0user \u00a0system elapsed \n\u00a0 \u00a0<\/span><span class=\"lit\">7.76<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">0.06<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">8.23<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span><\/pre>\n<p>The sqldf statements above could also be done in one sqldf call like this:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># define DF1 and DF2 as before<\/span>\n<span class=\"kwd\">set<\/span><span class=\"pun\">.<\/span><span class=\"pln\">seed<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nn <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">1000000<\/span><span class=\"pln\">\nDF1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"lit\">4<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span 
class=\"pun\">),<\/span><span class=\"pln\"> c1 <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> runif<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nDF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> \n\u00a0 \u00a0b <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> sample<\/span><span class=\"pun\">(<\/span><span class=\"lit\">4<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> n<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> replace <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> c2 <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> runif<\/span><span class=\"pun\">(<\/span><span class=\"pln\">n<\/span><span class=\"pun\">))<\/span>\n\n<span class=\"com\"># combine all sqldf calls from before into one call<\/span><span class=\"pln\">\n\nresult <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create index ai1 on DF1(a, b)\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 <\/span><span class=\"str\">\"create index ai2 on DF2(a, b)\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 <\/span><span class=\"str\">\"select * from main.DF1 natural join main.DF2\"<\/span><span class=\"pun\">))<\/span><\/pre>\n<p>Note that if 
your data is so large that you need indexes, it may also be too large to hold the database in memory. If you find that it is overflowing memory, use the <tt>dbname=<\/tt> sqldf argument, e.g. <tt>sqldf(c(\"create...\", \"create...\", \"select...\"), dbname = tempfile())<\/tt> so that the intermediate results are stored in an external database file rather than in memory.<\/p>\n<p><i>Note:<\/i> The index <tt>ai1<\/tt> is not actually used, so we could have saved the time it took to create it by creating only <tt>ai2<\/tt>.<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create index ai2 on DF2(a, b)\"<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"select * from DF1 natural join main.DF2\"<\/span><span class=\"pun\">))<\/span><\/pre>\n<p>Example 4j. Per Group Max and Min<\/p>\n<p>Note that the Date variable is passed to SQLite as the number of days since 1970-01-01, whereas SQLite uses an earlier origin, so we add <tt>julianday('1970-01-01')<\/tt> to convert from the origin of R&#8217;s <tt>\"Date\"<\/tt> class to SQLite&#8217;s origin. 
Note that the output column called <tt>Date<\/tt> is automatically converted to <tt>\"Date\"<\/tt> class by the sqldf heuristic because there is an input column that has the same name.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> URL <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"http:\/\/ichart.finance.yahoo.com\/table.csv?s=GOOG&amp;a=07&amp;b=19&amp;c=2004&amp;d=03&amp;e=16&amp;f=2010&amp;g=d&amp;ignore=.csv\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF25 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">(<\/span><span class=\"pln\">URL<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> nrows <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">25<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF25$Date <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF25$Date<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select Date, a.High, a.Low, b.Close, a.Volume\n+ from (select max(Date) Date, min(Low) Low, max(High) High, sum(Volume) Volume\n+ from DF25 \n+ group by date(Date + julianday('1970-01-01'), 'start of month')\n+ ) as a join DF25 b using(Date)\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Date<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">High<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"typ\">Low<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"typ\">Close<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Volume<\/span>\n<span 
class=\"lit\">1<\/span> <span class=\"lit\">2010<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pun\">-<\/span><span class=\"lit\">31<\/span> <span class=\"lit\">588.28<\/span> <span class=\"lit\">539.70<\/span> <span class=\"lit\">567.12<\/span> <span class=\"lit\">51541600<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2010<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pun\">-<\/span><span class=\"lit\">16<\/span> <span class=\"lit\">597.84<\/span> <span class=\"lit\">549.63<\/span> <span class=\"lit\">550.15<\/span> <span class=\"lit\">41201900<\/span><\/pre>\n<p>and here is another shorter one that uses a trick of Magnus Hagander in the second Stackoverflow link below:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select \n+ max(Date) Date, \n+ max(High) High, \n+ min(Low) Low, \n+ max(100000 * Date + Close) % 100000 Close,\n+ sum(Volume) Volume\n+ from DF25 \n+ group by date(Date + julianday('1970-01-01'), 'start of month')\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"typ\">Date<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">High<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"typ\">Low<\/span> <span class=\"typ\">Close<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Volume<\/span>\n<span class=\"lit\">1<\/span> <span class=\"lit\">2010<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pun\">-<\/span><span class=\"lit\">31<\/span> <span class=\"lit\">588.28<\/span> <span class=\"lit\">539.70<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">567<\/span> <span class=\"lit\">51541600<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">2010<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">16<\/span> <span class=\"lit\">597.84<\/span> <span class=\"lit\">549.63<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"lit\">550<\/span> <span class=\"lit\">41201900<\/span><\/pre>\n<p>Also see <a href=\"http:\/\/www.xaprb.com\/blog\/2007\/03\/14\/how-to-find-the-max-row-per-group-in-sql-without-subqueries\/\" rel=\"nofollow\">this Xaprb link<\/a> for an approach without subqueries. For more discussion see <a href=\"http:\/\/stackoverflow.com\/questions\/121387\/sql-fetch-the-row-which-has-the-max-value-for-a-column\" rel=\"nofollow\">this Stack Overflow link<\/a> and <a href=\"http:\/\/stackoverflow.com\/questions\/1140254\/postgresql-vlookup\" rel=\"nofollow\">this Stack Overflow link<\/a>. The last link shows how to use analytical queries, which are available in PostgreSQL. The PostgreSQL database, like SQLite and H2, is supported by sqldf.<\/p>\n<h2><a name=\"Example_5._Insert_Variables\"><\/a>Example 5. Insert Variables<\/h2>\n<p>Here is an example of inserting evaluated variables into a query using <a href=\"http:\/\/code.google.com\/p\/gsubfn\/\" rel=\"nofollow\">gsubfn<\/a> quasi-perl-style string interpolation. gsubfn is used by sqldf, so it is already loaded. 
Note that we must use the <tt>fn$<\/tt> prefix to invoke the interpolation functionality:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> minSL <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">7<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> limit <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">3<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> species <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"virginica\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> fn$sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris where \\\"Sepal.Length\\\" &gt; $minSL and species = '$species' limit $limit\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.1<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span 
class=\"lit\">6.6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.1<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.8<\/span><span class=\"pln\"> virginica<\/span><\/pre>\n<h2><a name=\"Example_6._File_Input\"><\/a>Example 6. File Input<\/h2>\n<p>Note that there is a new command <tt>read.csv.sql<\/tt> which provides an alternative interface to the approach discussed in this section. See Example 13 for that.<\/p>\n<p>sqldf normally deletes any database it creates after completion but the sample code <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-October\/257270.html\" rel=\"nofollow\">at the bottom of this post<\/a> shows how to set up a database and read a file into it without having the database destroyed afterwards.<\/p>\n<p>sqldf will not only look for data frames used in the SQL statement but will also look for R objects of class <tt>\"file\"<\/tt>. For such objects it will directly import the associated file into the database without going through R, allowing files that are larger than an R workspace to be handled and also providing potential speed advantages. That is, if <tt>f &lt;- file(\"abc.csv\")<\/tt> is a file object and <tt>f<\/tt> is used as the table name in the SQL statement then the file <tt>abc.csv<\/tt> is imported into the database as table <tt>f<\/tt>. With SQLite, the actual reading of the file into the database is done in a C routine in RSQLite so the file is transferred directly to the database without going through R. 
If the <tt>sqldf<\/tt> argument <tt>dbname<\/tt> is used then it specifies a filename (either existing or created by <tt>sqldf<\/tt> if not existing). That file is used as the database (rather than memory), allowing files larger than physical memory to be handled. By using an appropriate <tt>where<\/tt> clause or a subset of column names a portion of the table can be retrieved into R even if the file itself is too large for R or for memory.<\/p>\n<p>There are some caveats. The RSQLite <tt>dbWriteTable<\/tt>\/<tt>sqliteImportFile<\/tt> routines that <tt>sqldf<\/tt> uses to transfer the file directly to the database are intended for speed, so they are not as flexible as <tt>read.table<\/tt>. Also they have slightly different defaults. The default for <tt>sep<\/tt> is <tt>file.format = list(sep = \",\")<\/tt>. If the first row of the file has one fewer component than subsequent ones then <tt>sqldf<\/tt> assumes that <tt>file.format = list(header = TRUE, row.names = TRUE)<\/tt> and otherwise that <tt>file.format = list(header = FALSE, row.names = FALSE)<\/tt>. 
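<\/p>\n<p>As a minimal sketch of stating the format explicitly rather than relying on these defaults, the following writes a semicolon-delimited file with a header row and no row names and then reads it back through a file object. The temporary file and the names <tt>tmp<\/tt>, <tt>f<\/tt> and <tt>res<\/tt> are illustrative assumptions, not part of the original examples; it relies only on the <tt>header<\/tt> and <tt>sep<\/tt> components of <tt>file.format<\/tt> that are used elsewhere in this section:<\/p>\n<pre class=\"prettyprint\">library(sqldf)\n\n# hypothetical test file: semicolon-delimited, header row, no row names\ntmp &lt;- tempfile(fileext = \".csv\")\nwrite.table(head(iris, 3), tmp, sep = \";\", quote = FALSE, row.names = FALSE)\n\n# say so explicitly instead of letting sqldf guess from the first row\nf &lt;- file(tmp)\nres &lt;- sqldf(\"select * from f\", file.format = list(header = TRUE, sep = \";\"))\nnrow(res)  # number of records imported from the file<\/pre>\n<p>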
<tt>.csv<\/tt> file format is only partly supported &#8212; quotes are not regarded as special.<\/p>\n<p>In addition to the examples below there is an example <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2009-May\/199991.html\" rel=\"nofollow\">here<\/a> and another one with performance results <a href=\"http:\/\/www.cerebralmastication.com\/2009\/11\/loading-big-data-into-r\/\" rel=\"nofollow\">here<\/a>.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 6a.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># test of file connections with sqldf<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># create test .csv file of just 3 records<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># look at contents of iris3.dat<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> readLines<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species\"<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">2<\/span><span class=\"pun\">]<\/span> 
<span class=\"str\">\"1,5.1,3.5,1.4,0.2,setosa\"<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \n<\/span><span class=\"pun\">[<\/span><span class=\"lit\">3<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"2,4.9,3,1.4,0.2,setosa\"<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \n<\/span><span class=\"pun\">[<\/span><span class=\"lit\">4<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"3,4.7,3.2,1.3,0.2,setosa\"<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \n<\/span><span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># set up file connection<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iris3 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span 
class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 6b.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># similar but uses disk - useful if file were large<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># According to http:\/\/www.sqlite.org\/whentouse.html<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># SQLite can handle files up to several dozen gigabytes.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># (Note in this case readTable and readTableIndex in R.utils<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># package or read.table from the base of R, setting the colClasses <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># argument to \"NULL\" for columns you don't want read in, might be<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># alternatives.)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> tempfile<\/span><span class=\"pun\">())<\/span><span class=\"pln\">\n\u00a0<\/span><span 
class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 6c.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># with this format, header=TRUE needs to be specified<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iris3a.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span 
class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span> \n<span class=\"pun\">+<\/span><span class=\"pln\"> \u00a0row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iris3a <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3a.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris3a\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">.<\/span><span class=\"pln\">format <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> 
\u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 6d.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># header can alternately be specified as object attribute<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> attr<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris3a<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"file.format\"<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris3a\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span 
class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 6e.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># create a test file with all 150 records from iris<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># and select 4 records at random without reading entire file into R<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span 
class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"iris150.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iris150 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris150.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris150 order by random(*) limit 4\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.7<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.0<\/span><span class=\"pln\"> \u00a0 \u00a0 
\u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.1<\/span><span class=\"pln\"> \u00a0 \u00a0setosa\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.9<\/span><span class=\"pln\"> virginica\n<\/span><span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># or use read.csv.sql and it is just one line<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris150.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"select * from file order by random(*) limit 4\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span 
class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">3.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.0<\/span><span class=\"pln\"> versicolor\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.0<\/span><span class=\"pln\"> versicolor\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.9<\/span><span class=\"pln\"> \u00a0virginica\n<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.3<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa<\/span><\/pre>\n<p>Example 6f. 
If our file has fixed width fields rather than delimited then we can still handle it if we parse the lines manually with substr:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># write some test data to \"fixed\"<\/span>\n<span class=\"com\"># Field 1 has width of 1 column and field 2 has 4 columns<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"1 8.3\n210.3\n319.0\n416.0\n515.6\n719.8\n\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"fixed\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># get 3 random records using sqldf<\/span>\n<span class=\"kwd\">fixed<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"fixed\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nattr<\/span><span class=\"pun\">(<\/span><span class=\"kwd\">fixed<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"file.format\"<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\";\"<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># ; can be any char not in file<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select substr(V1, 1, 1) f1, substr(V1, 2, 4) f2 from fixed order by random(*) limit 3\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>Another example of fixed width data is <a href=\"https:\/\/sites.google.com\/site\/timriffepersonal\/DemogBlog\/newformetrickforworkingwithbigishdatainr\" rel=\"nofollow\">here<\/a> (however, note that changing the sep needs to be done in the example in that link too).<\/p>\n<p>Example 6g. 
Defaults.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># If the first row has one fewer column than subsequent rows then <\/span>\n<span class=\"com\"># header &lt;- row.names &lt;- TRUE is assumed as in example 6a; otherwise,<\/span>\n<span class=\"com\"># header &lt;- row.names &lt;- FALSE is assumed as shown here:<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iris3nohdr.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> col<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> readLines<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3nohdr.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"str\">\"5.1,3.5,1.4,0.2,setosa\"<\/span> <span class=\"str\">\"4.9,3,1.4,0.2,setosa\"<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"str\">\"4.7,3.2,1.3,0.2,setosa\"<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iris3nohdr <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3nohdr.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from iris3nohdr\"<\/span><span 
class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0V1 \u00a0V2 \u00a0V3 \u00a0V4 \u00a0 \u00a0 V5\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">5.1<\/span> <span class=\"lit\">3.5<\/span> <span class=\"lit\">1.4<\/span> <span class=\"lit\">0.2<\/span><span class=\"pln\"> setosa\n<\/span><span class=\"lit\">2<\/span> <span class=\"lit\">4.9<\/span> <span class=\"lit\">3.0<\/span> <span class=\"lit\">1.4<\/span> <span class=\"lit\">0.2<\/span><span class=\"pln\"> setosa\n<\/span><span class=\"lit\">3<\/span> <span class=\"lit\">4.7<\/span> <span class=\"lit\">3.2<\/span> <span class=\"lit\">1.3<\/span> <span class=\"lit\">0.2<\/span><span class=\"pln\"> setosa<\/span><\/pre>\n<h2><a name=\"Example_7._Nested_Select\"><\/a>Example 7. Nested Select<\/h2>\n<p>For each species show the two rows with the largest sepal lengths:<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 7a.<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris i \n+ \u00a0 where rowid in \n+ \u00a0 \u00a0(select rowid from iris where Species = i.Species order by \"Sepal.Length\" desc limit 2)\n+ \u00a0 order by i.Species, i.\"Sepal.Length\" desc'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 
<\/span><span class=\"lit\">4.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">4.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.4<\/span><span class=\"pln\"> \u00a0 \u00a0 setosa\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.0<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> versicolor\n<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">1.5<\/span><span class=\"pln\"> versicolor\n<\/span><span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.0<\/span><span class=\"pln\"> 
\u00a0virginica\n<\/span><span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">7.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.8<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">2.2<\/span><span class=\"pln\"> \u00a0virginica<\/span><\/pre>\n<p>Here is a similar example. In this one <tt>DF<\/tt> represents a time series whose values are in column <tt>x<\/tt> and whose times are dates in column <tt>tt<\/tt>. The times have gaps &#8212; in fact only every other day is present. The code below displays the first row at or past the 21st of the month for each year\/month. First we append year, month and day columns using <tt>month.day.year<\/tt> from the <tt>chron<\/tt> package and then do the computation using <tt>sqldf<\/tt>. (For a version of this using the <tt>zoo<\/tt> package rather than <tt>sqldf<\/tt> see: <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2007-November\/145925.html\" rel=\"nofollow\">https:\/\/stat.ethz.ch\/pipermail\/r-help\/2007-November\/145925.html<\/a>).<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 7b.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">chron<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">x <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">101<\/span><span class=\"pun\">:<\/span><span class=\"lit\">200<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> tt <\/span><span 
class=\"pun\">=<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"2000-01-01\"<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">+<\/span><span class=\"pln\"> seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">0<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> len <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">100<\/span><span class=\"pun\">,<\/span> <span class=\"kwd\">by<\/span> <span class=\"pun\">=<\/span> <span class=\"lit\">2<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> cbind<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> month<\/span><span class=\"pun\">.<\/span><span class=\"pln\">day<\/span><span class=\"pun\">.<\/span><span class=\"pln\">year<\/span><span class=\"pun\">(<\/span><span class=\"pln\">unclass<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF$tt<\/span><span class=\"pun\">)))<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from DF d\n+ \u00a0 where rowid in \n+ \u00a0 \u00a0(select rowid from DF \n+ \u00a0 \u00a0 \u00a0 where year = d.year and month = d.month and day &gt;= 21 limit 1)\n+ \u00a0 \u00a0order by tt\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 x \u00a0 \u00a0 \u00a0 \u00a0 tt \u00a0 \u00a0month \u00a0 \u00a0day \u00a0 \u00a0year\n<\/span><span class=\"lit\">1<\/span> <span class=\"lit\">111<\/span> <span class=\"lit\">2000<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 
<\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span>\n<span class=\"lit\">2<\/span> <span class=\"lit\">127<\/span> <span class=\"lit\">2000<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span><span class=\"pun\">-<\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span>\n<span class=\"lit\">3<\/span> <span class=\"lit\">141<\/span> <span class=\"lit\">2000<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pun\">-<\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span>\n<span class=\"lit\">4<\/span> <span class=\"lit\">157<\/span> <span class=\"lit\">2000<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pun\">-<\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span>\n<span class=\"lit\">5<\/span> <span class=\"lit\">172<\/span> <span class=\"lit\">2000<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pun\">-<\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">22<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span>\n<span class=\"lit\">6<\/span> <span class=\"lit\">187<\/span> <span class=\"lit\">2000<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pun\">-<\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0 \u00a0 <\/span><span class=\"lit\">21<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">2000<\/span><\/pre>\n<p>Here is another example of a nested select. We select each row of a for which st\/en overlaps with some st\/en of b.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 7c.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> a <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"st en\n+ 1 4\n+ 11 14\n+ 3 4\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> b <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"st en\n+ 2 5\n+ 3 6\n+ 30 44\"<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from a where \n+ (select count(*) from b where a.en &gt;= b.st and b.en &gt;= a.st) &gt; 0\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 st en\n<\/span><span 
class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">4<\/span>\n<span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">4<\/span><\/pre>\n<p>7d. Another example of a nested select with sqldf is shown <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-March\/231975.html\" rel=\"nofollow\">here<\/a>.<\/p>\n<h2><a name=\"Example_8._Specifying_File_Format\"><\/a>Example 8. Specifying File Format<\/h2>\n<p>When file() is used as in Example 6, RSQLite reads in the first 50 lines to determine the column classes. What if those lines all contain numbers but letters appear later in the file? In that case we have to override its choice. Here are two ways:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># example 8a - file.format attribute on file.object<\/span><span class=\"pln\">\n\nnumStr <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">character<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">100<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">numStr<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Hello\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span 
class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nff <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nattr<\/span><span class=\"pun\">(<\/span><span class=\"pln\">ff<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"file.format\"<\/span><span class=\"pun\">)<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">colClasses <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"character\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\n\ntail<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from ff\"<\/span><span class=\"pun\">))<\/span>\n\n\n<span class=\"com\"># example 8b - using file.format argument<\/span><span class=\"pln\">\n\nnumStr <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"pln\">character<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">100<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a 
<\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">numStr<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"Hello\"<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nff <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\ntail<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from ff\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\">\n\u00a0file<\/span><span class=\"pun\">.<\/span><span class=\"pln\">format <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> list<\/span><span class=\"pun\">(<\/span><span class=\"pln\">colClasses <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"character\"<\/span><span class=\"pun\">))))<\/span><\/pre>\n<h2><a name=\"Example_9.__Working_with_Databases\"><\/a>Example 9. 
Working with Databases<\/h2>\n<p>sqldf is usually used to operate on data frames, but it can also store a table in a database and query it repeatedly in subsequent sqldf statements (although in that case you might be better off just using RSQLite or another database directly). There are two ways to do this. This Example section relies on two facts: if you specify the database explicitly, sqldf does not delete the database at the end; and if you create a table explicitly using create table, sqldf does not delete that table. (Note, however, that this results in duplicate tables in the database, so it takes up twice as much space as one table.) The second way is to use persistent connections, as shown in the Example section after this one.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># create new empty database called mydb<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"attach 'mydb' as new\"<\/span><span class=\"pun\">)<\/span> \n\n<span class=\"com\"># create a new table, mytab, in the new database<\/span>\n<span class=\"com\"># Note that sqldf does not delete tables created with create table.<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"create table mytab as select * from BOD\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># shows it's still there<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from mytab\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># read a file into the mydb database using read.csv.sql without deleting it<\/span>\n<span class=\"com\">#<\/span>\n<span class=\"com\"># 1. 
First create a test file.<\/span>\n<span class=\"com\"># 2. Then read it into the mydb database we created using the sqldf(\"attach...\") above.<\/span>\n<span class=\"com\"># \u00a0 \u00a0Since sqldf automatically cleans up after itself we hide <\/span>\n<span class=\"com\"># \u00a0 \u00a0the table creation in an sql statement so table is not deleted.<\/span>\n<span class=\"com\"># 3. Finally list the table names in the database.<\/span><span class=\"pln\">\n\u00a0\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">BOD<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nread<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"~\/tmp.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"create table mytab as select * from file\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from sqlite_master\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"Example_10._Persistent_Connections\"><\/a>Example 10. 
Persistent Connections<\/h2>\n<p>These three examples show the use of persistent connections in sqldf. This would be used when one has a large database that one wants to store and then make queries from so that one does not have to reload it on each execution of sqldf. (Note that if one just needs a series of sql statements ending in a single query an alternative would be just to use a vector of sql statements in a single sqldf call.)<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 10a.<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># create test .csv file of just 3 records (same as example 6)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># set up file connection<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iris3 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris3.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># creates connection so in memory database persists after sqldf call<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span> \n<span class=\"pun\">&lt;<\/span><span 
class=\"typ\">SQLiteConnection<\/span><span class=\"pun\">:(<\/span><span class=\"lit\">7384<\/span><span class=\"pun\">,<\/span><span class=\"lit\">62<\/span><span class=\"pun\">)&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># uses connection just created<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># we now have iris3 variable in R workspace and an iris3 table<\/span>\n<span 
class=\"pun\">&gt;<\/span> <span class=\"com\"># so ensure sqldf uses the one in the main database by writing<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># main.iris3. \u00a0(Another possibility here would have been to<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># delete the iris3 variable from the R workspace to avoid the<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># ambiguity -- in that case one could just write iris3 instead<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># of main.iris3.)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from main.iris3 where \"Sepal.Width\" = 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># close<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span><span class=\"pln\">\nNULL\n\n<\/span><span class=\"pun\">&gt;<\/span> <span class=\"com\"># Example 
10b.<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\">#<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Here is another way to do example 10a. \u00a0We use the same iris3,<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># iris3.dat and sqldf development version as above. \u00a0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># We grab the connection explicitly, set up the database using sqldf and then <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># for the second call we call dbGetQuery from RSQLite. \u00a0<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># In that case we don't need to qualify iris3 as main.iris3 since<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># RSQLite would not understand R variables anyway so there is no <\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># ambiguity.<\/span>\n\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> con <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span> \n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># uses connection just created<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select * from iris3 where \"Sepal.Width\" &gt; 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span 
class=\"lit\">5.1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.5<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.7<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3.2<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span><span class=\"pln\"> dbGetQuery<\/span><span class=\"pun\">(<\/span><span class=\"pln\">con<\/span><span class=\"pun\">,<\/span> <span class=\"str\">'select * from iris3 where \"Sepal.Width\" = 3'<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 <\/span><span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Sepal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Length<\/span> <span class=\"typ\">Petal<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Width<\/span> <span class=\"typ\">Species<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">4.9<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">1.4<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">0.2<\/span><span class=\"pln\"> \u00a0setosa\n<\/span><span class=\"pun\">&gt;<\/span> \n<span 
class=\"pun\">&gt;<\/span> <span class=\"com\"># close<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">()<\/span><span class=\"pln\">\nNULL<\/span><\/pre>\n<p>Here is an example of reading a csv file using read.csv.sql and then reading it again using a persistent connection:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># Example 10c.<\/span><span class=\"pln\">\n\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nsqldf<\/span><span class=\"pun\">()<\/span><span class=\"pln\">\nread<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"select count(*) from file\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># now re-read it from the sqlite database<\/span><span class=\"pln\">\ndd <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from file\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># now close the connection and destroy the database<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">()<\/span><\/pre>\n<h2><a name=\"Example_11._Between_and_Alternatives\"><\/a>Example 11. 
Between and Alternatives<\/h2>\n<pre class=\"prettyprint\"><span class=\"com\"># example thanks to Michael Rehberg<\/span>\n<span class=\"com\">#<\/span>\n<span class=\"com\"># build sample dataframes<\/span><span class=\"pln\">\nseqdf <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">thetime<\/span><span class=\"pun\">=<\/span><span class=\"pln\">seq<\/span><span class=\"pun\">(<\/span><span class=\"lit\">100<\/span><span class=\"pun\">,<\/span><span class=\"lit\">225<\/span><span class=\"pun\">,<\/span><span class=\"lit\">5<\/span><span class=\"pun\">),<\/span><span class=\"pln\">thevalue<\/span><span class=\"pun\">=<\/span><span class=\"pln\">factor<\/span><span class=\"pun\">(<\/span><span class=\"pln\">letters<\/span><span class=\"pun\">))<\/span><span class=\"pln\">\nboundsdf <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">thestart<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">110<\/span><span class=\"pun\">,<\/span><span class=\"lit\">160<\/span><span class=\"pun\">,<\/span><span class=\"lit\">200<\/span><span class=\"pun\">),<\/span><span class=\"pln\">theend<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">130<\/span><span class=\"pun\">,<\/span><span class=\"lit\">180<\/span><span class=\"pun\">,<\/span><span class=\"lit\">220<\/span><span class=\"pun\">),<\/span><span class=\"pln\">groupID<\/span><span class=\"pun\">=<\/span><span class=\"pln\">c<\/span><span class=\"pun\">(<\/span><span class=\"lit\">555<\/span><span class=\"pun\">,<\/span><span class=\"lit\">666<\/span><span class=\"pun\">,<\/span><span class=\"lit\">777<\/span><span 
class=\"pun\">))<\/span>\n\n<span class=\"com\"># run the query using two inequalities<\/span><span class=\"pln\">\ntestquery_1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select seqdf.thetime, seqdf.thevalue, boundsdf.groupID \nfrom seqdf left join boundsdf on (seqdf.thetime &lt;= boundsdf.theend) and (seqdf.thetime &gt;= boundsdf.thestart)\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># run the same query using 'between...and' clause<\/span><span class=\"pln\">\ntestquery_2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select seqdf.thetime, seqdf.thevalue, boundsdf.groupID \nfrom seqdf LEFT JOIN boundsdf ON (seqdf.thetime BETWEEN boundsdf.thestart AND boundsdf.theend)\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"Example_12._Combine_two_files_in_permanent_database\"><\/a>Example 12. Combine two files in permanent database<\/h2>\n<p>When we issue a series of normal <tt>sqldf<\/tt> statements, sqldf automatically removes, after each one, any tables and databases it created in that statement. However, it does not know about ones that <tt>sqlite<\/tt> creates, so a database created using <tt>attach<\/tt> and the tables created using <tt>create table<\/tt> won&#8217;t be deleted.<\/p>\n<p>Also, if <tt>sqldf<\/tt> is used without the <tt>x=<\/tt> argument (omitting x= denotes the opening of a persistent connection), then objects created in the database, including those created by <tt>sqldf<\/tt> and <tt>sqlite<\/tt>, are not deleted when the persistent connection is destroyed by the next <tt>sqldf<\/tt> statement with no <tt>x=<\/tt> argument.<\/p>\n<p>If we have forgotten whether we have a connection open, we can check with either of these:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">dbListConnections<\/span><span class=\"pun\">(<\/span><span class=\"typ\">SQLite<\/span><span 
class=\"pun\">())<\/span> <span class=\"com\"># from DBI<\/span><span class=\"pln\">\n\ngetOption<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"sqldf.connection\"<\/span><span class=\"pun\">)<\/span> <span class=\"com\"># set by sqldf<\/span><\/pre>\n<p>Here is an example that illustrates part of the above. See the prior examples for more.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"com\"># set up some test data<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">head<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"irishead.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> write<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">tail<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"lit\">3<\/span><span class=\"pun\">),<\/span> <span class=\"str\">\"iristail.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span 
class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># create new empty database called mydb<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"attach 'mydb' as new\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> \nNULL\n<\/span><span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> irishead <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"irishead.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> iristail <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> file<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iristail.dat\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># read tables into mydb<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select count(*) from irishead\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 count<\/span><span class=\"pun\">(*)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">3<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select count(*) from iristail\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 count<\/span><span class=\"pun\">(*)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span 
class=\"lit\">3<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># get count of all records from union<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'select count(*) from (select * from main.irishead \n+ union \n+ select * from main.iristail)'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mydb\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 count<\/span><span class=\"pun\">(*)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0<\/span><span class=\"lit\">6<\/span><\/pre>\n<h2><a name=\"Example_13._read.csv.sql_and_read.csv2.sql\"><\/a>Example 13. read.csv.sql and read.csv2.sql<\/h2>\n<p><tt>read.csv.sql<\/tt> is an interface to <tt>sqldf<\/tt> that works like <tt>read.csv<\/tt> in R except that it also provides an <tt>sql=<\/tt> argument and not all of the other arguments of <tt>read.csv<\/tt> are supported. It uses (1) SQLite&#8217;s import facility via RSQLite to read the input file into a temporary disk-based SQLite database which is created on the fly. (2) Then it uses the provided SQL statement to read the table so created into R. As the first step imports the data directly into SQLite without going through R it can handle larger files than R itself can handle as long as the SQL statement filters it to a size that R can handle. 
Here is Example 6c redone using this facility:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># Example 13a.<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\niris<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">'select * from file where \"Sepal.Length\" &gt; 5'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># Example 13b. \u00a0read.csv2.sql. 
\u00a0Commas are decimals and ; is sep.<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"typ\">Lines<\/span> <span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"Sepal.Length;Sepal.Width;Petal.Length;Petal.Width;Species\n5,1;3,5;1,4;0,2;setosa\n4,9;3;1,4;0,2;setosa\n4,7;3,2;1,3;0,2;setosa\n4,6;3,1;1,5;0,2;setosa\n\"<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"iris2.csv\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\niris<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv2<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris2.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">'select * from file where \"Sepal.Length\" &gt; 5'<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># Example 13c. 
Use of filter= to process fixed field widths.<\/span>\n\n<span class=\"com\"># This example assumes gawk is available for use as a filter:<\/span>\n<span class=\"com\"># http:\/\/www.icewalkers.com\/Linux\/Software\/514530\/Gawk.html<\/span>\n<span class=\"com\"># http:\/\/gnuwin32.sourceforge.net\/packages\/gawk.htm<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"112333\n123456\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"fixed.dat\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\ncat<\/span><span class=\"pun\">(<\/span><span class=\"str\">'BEGIN { FIELDWIDTHS = \"2 1 3\"; OFS = \",\"; print \"A,B,C\" }\n{ $1 = $1; print }'<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> file <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"fixed.awk\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># the following worked on Windows Vista. \u00a0One user told me that it only worked if he<\/span>\n<span class=\"com\"># omitted the eol= argument so try it both ways on your system and use the way that<\/span>\n<span class=\"com\"># works for your system.<\/span>\n\n<span class=\"kwd\">fixed<\/span> <span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"fixed.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> eol <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"\\n\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> filter <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"gawk -f fixed.awk\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># Example 13d. 
\u00a0Read a csv file into the database but do not drop the database or table<\/span>\n\n<span class=\"com\"># create test file<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">iris<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># create an empty database (can skip this step if database already exists)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"attach mytestdb as new\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># read into table called iris in the mytestdb sqlite database<\/span><span class=\"pln\">\nread<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"iris.csv\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"create table main.iris as select * from file\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"mytestdb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># look at first three lines<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select * from main.iris limit 3\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> dbname <\/span><span 
class=\"pun\">=<\/span> <span class=\"str\">\"mytestdb\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># example 13e. \u00a0Read in only column j of a csv file where j may vary.<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># create test data file<\/span><span class=\"pln\">\nnms <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> names<\/span><span class=\"pun\">(<\/span><span class=\"pln\">anscombe<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nwrite<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">anscombe<\/span><span class=\"pun\">,<\/span> <span class=\"str\">\"anscombe.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sep <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\",\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> quote <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> \n\u00a0 \u00a0 \u00a0 \u00a0 row<\/span><span class=\"pun\">.<\/span><span class=\"pln\">names <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> FALSE<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nj <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"lit\">2<\/span><span class=\"pln\">\nDF2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> fn$read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">csv<\/span><span class=\"pun\">.<\/span><span class=\"pln\">sql<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"anscombe.dat\"<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> sql <\/span><span class=\"pun\">=<\/span> <span class=\"str\">\"select `nms[j]` from file\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>Also see this <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-November\/260931.html\" 
rel=\"nofollow\">example<\/a> and this further <a href=\"http:\/\/stackoverflow.com\/questions\/6966723\/how-to-allocate-append-a-large-column-of-date-objects-to-a-data-frame\/6966771#6966771\" rel=\"nofollow\">example<\/a>. The latter illustrates the use of the <tt>method=<\/tt> argument.<\/p>\n<h2><a name=\"Example_14._Use_of_spatialite_library_functions\"><\/a>Example 14. Use of spatialite library functions<\/h2>\n<p><strong>This example needs to be revised: automatic loading of spatialite has been removed from sqldf and replaced by the functions in RSQLite.extfuns, which are loaded instead.<\/strong><\/p>\n<p>This example will only work if spatialite-1.dll is on your PATH. It shows how to access a function in that dll. Other than placing it on your PATH, no other setup is needed. (Note that libspatialite-1.dll is only looked up the first time sqldf runs in a session, so be sure that it is on your PATH before starting sqldf.)<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># stddev_pop is a function in the spatialite library similar to sd in R<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># Note bug: spatialite has stddev_pop and stddev_samp reversed and ditto for var_pop and var_samp. 
\u00a0More on bug at:<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"com\"># http:\/\/groups.google.com\/group\/spatialite-users\/msg\/182f1f629c922607<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select avg(demand), stddev_pop(demand) from BOD\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 avg<\/span><span class=\"pun\">(<\/span><span class=\"pln\">demand<\/span><span class=\"pun\">)<\/span><span class=\"pln\"> stddev_pop<\/span><span class=\"pun\">(<\/span><span class=\"pln\">demand<\/span><span class=\"pun\">)<\/span>\n<span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0 \u00a0<\/span><span class=\"lit\">14.83333<\/span><span class=\"pln\"> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span class=\"lit\">4.630623<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> c<\/span><span class=\"pun\">(<\/span><span class=\"pln\">mean<\/span><span class=\"pun\">(<\/span><span class=\"pln\">BOD$demand<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> sd<\/span><span class=\"pun\">(<\/span><span class=\"pln\">BOD$demand<\/span><span class=\"pun\">))<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"lit\">14.833333<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">4.630623<\/span><\/pre>\n<h2><a name=\"Example_15._Use_of_RSQLite.extfuns_library_functions\"><\/a>Example 15. Use of RSQLite.extfuns library functions<\/h2>\n<p>The RSQLite.extfuns are automatically loaded (as sqldf now depends on the <a href=\"http:\/\/cran.r-project.org\/web\/packages\/RSQLite.extfuns\/index.html\" rel=\"nofollow\">RSQLite.extfuns<\/a> R package which includes Liam Healy&#8217;s extension functions for SQLite). 
In addition to all the <a href=\"http:\/\/www.sqlite.org\/lang_corefunc.html\" rel=\"nofollow\">core functions<\/a>, <a href=\"http:\/\/www.sqlite.org\/lang_datefunc.html\" rel=\"nofollow\">date functions<\/a> and <a href=\"http:\/\/www.sqlite.org\/lang_aggfunc.html\" rel=\"nofollow\">aggregate functions<\/a> that SQLite itself provides, the following extension functions are available for use within SQL select statements: <strong>Math:<\/strong> acos, asin, atan, atn2, atan2, acosh, asinh, atanh, difference, degrees, radians, cos, sin, tan, cot, cosh, sinh, tanh, coth, exp, log, log10, power, sign, sqrt, square, ceil, floor, pi. <strong>String:<\/strong> replicate, charindex, leftstr, rightstr, ltrim, rtrim, trim, replace, reverse, proper, padl, padr, padc, strfilter. <strong>Aggregate:<\/strong> stdev, variance, mode, median, lower_quartile, upper_quartile. See the bottom of <a href=\"http:\/\/www.sqlite.org\/contrib\/\" rel=\"nofollow\">http:\/\/www.sqlite.org\/contrib\/<\/a> for more info on these extension functions.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select avg(demand) mean, variance(demand) var from BOD\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 mean \u00a0 \u00a0 \u00a0<\/span><span class=\"kwd\">var<\/span>\n<span class=\"lit\">1<\/span> <span class=\"lit\">14.83333<\/span> <span class=\"lit\">21.44267<\/span>\n<span class=\"pun\">&gt;<\/span> <span class=\"kwd\">var<\/span><span class=\"pun\">(<\/span><span class=\"pln\">BOD$demand<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">]<\/span> <span class=\"lit\">21.44267<\/span><\/pre>\n<h2><a name=\"Example_16._Moving_Average\"><\/a>Example 16. 
Moving Average<\/h2>\n<p>This is a simplified version of the example in this <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2010-August\/249996.html\" rel=\"nofollow\">r-help post<\/a>. Here we compute, for each date, the moving average of <tt>x<\/tt> over the values from 3 to 9 days earlier, separately for each illness.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span> <span class=\"typ\">Lines<\/span><span class=\"pln\"> \u00a0 <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"str\">\"date \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 illness x\n+ \u00a0 \u00a02006\/01\/01 \u00a0 \u00a0DERM 319\n+ \u00a0 \u00a02006\/01\/02 \u00a0 \u00a0DERM 388\n+ \u00a0 \u00a02006\/01\/03 \u00a0 \u00a0DERM 336\n+ \u00a0 \u00a02006\/01\/04 \u00a0 \u00a0DERM 255\n+ \u00a0 \u00a02006\/01\/05 \u00a0 \u00a0DERM 177\n+ \u00a0 \u00a02006\/01\/06 \u00a0 \u00a0DERM 377\n+ \u00a0 \u00a02006\/01\/07 \u00a0 \u00a0DERM 113\n+ \u00a0 \u00a02006\/01\/08 \u00a0 \u00a0DERM 253\n+ \u00a0 \u00a02006\/01\/09 \u00a0 \u00a0DERM 316\n+ \u00a0 \u00a02006\/01\/10 \u00a0 \u00a0DERM 187\n+ \u00a0 \u00a02006\/01\/11 \u00a0 \u00a0DERM 292\n+ \u00a0 \u00a02006\/01\/12 \u00a0 \u00a0DERM 275\n+ \u00a0 \u00a02006\/01\/13 \u00a0 \u00a0DERM 355\n+ \u00a0 \u00a02006\/01\/01 \u00a0 \u00a0FEVER 3190\n+ \u00a0 \u00a02006\/01\/02 \u00a0 \u00a0FEVER 3880\n+ \u00a0 \u00a02006\/01\/03 \u00a0 \u00a0FEVER 3360\n+ \u00a0 \u00a02006\/01\/04 \u00a0 \u00a0FEVER 2550\n+ \u00a0 \u00a02006\/01\/05 \u00a0 \u00a0FEVER 1770\n+ \u00a0 \u00a02006\/01\/06 \u00a0 \u00a0FEVER 3770\n+ \u00a0 \u00a02006\/01\/07 \u00a0 \u00a0FEVER 1130\n+ \u00a0 \u00a02006\/01\/08 \u00a0 \u00a0FEVER 2530\n+ \u00a0 \u00a02006\/01\/09 \u00a0 \u00a0FEVER 3160\n+ \u00a0 \u00a02006\/01\/10 \u00a0 \u00a0FEVER 1870\n+ \u00a0 \u00a02006\/01\/11 \u00a0 \u00a0FEVER 2920\n+ \u00a0 \u00a02006\/01\/12 \u00a0 \u00a0FEVER 2750\n+ \u00a0 \u00a02006\/01\/13 \u00a0 \u00a0FEVER 3550\"<\/span>\n<span class=\"pun\">&gt;<\/span> \n<span 
class=\"pun\">&gt;<\/span><span class=\"pln\"> DF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> read<\/span><span class=\"pun\">.<\/span><span class=\"pln\">table<\/span><span class=\"pun\">(<\/span><span class=\"pln\">textConnection<\/span><span class=\"pun\">(<\/span><span class=\"typ\">Lines<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> header <\/span><span class=\"pun\">=<\/span><span class=\"pln\"> TRUE<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> DF$date <\/span><span class=\"pun\">&lt;-<\/span> <span class=\"kwd\">as<\/span><span class=\"pun\">.<\/span><span class=\"typ\">Date<\/span><span class=\"pun\">(<\/span><span class=\"pln\">DF$date<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0t1.date,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0avg(t2.x) mean,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0date(min(t2.date) * 24 * 60 * 60, 'unixepoch') fromdate,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0date(max(t2.date) * 24 * 60 * 60, 'unixepoch') todate,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0max(t2.illness) illness\n+ \u00a0 \u00a0 \u00a0 \u00a0from \u00a0DF t1, DF t2\n+ \u00a0 \u00a0 \u00a0 \u00a0where julianday(t1.date) between julianday(t2.date) + 3 and\n+ julianday(t2.date) + 9\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0and t1.illness = t2.illness\n+ \u00a0 \u00a0 \u00a0 \u00a0group by t1.illness, t1.date\n+ \u00a0 \u00a0 \u00a0 \u00a0order by t1.illness, t1.date\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0date \u00a0 \u00a0 \u00a0mean \u00a0 fromdate \u00a0 \u00a0 todate illness\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span 
class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">319.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">353.5000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">347.6667<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">324.5000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">295.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">308.6667<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">7<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0<\/span><span 
class=\"lit\">280.7143<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">8<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">11<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">271.2857<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">9<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">12<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">261.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">10<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">13<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">239.7143<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">11<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">3190.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">12<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span> <span class=\"lit\">3535.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">13<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span> <span class=\"lit\">3476.6667<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">14<\/span> <span 
class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span> <span class=\"lit\">3245.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">15<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">2950.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">16<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span> <span class=\"lit\">3086.6667<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">17<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span> <span class=\"lit\">2807.1429<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">18<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">11<\/span> <span class=\"lit\">2712.8571<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">19<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">12<\/span> <span class=\"lit\">2610.0000<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">20<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">13<\/span> <span class=\"lit\">2397.1429<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0 FEVER<\/span><\/pre>\n<p>Because of the date processing this is a bit more 
conveniently done in H2 with its support for the date class, using the same <tt>DF<\/tt> that we just defined. Note that SQL functions like AVG and MIN must be written in upper case when using H2.<\/p>\n<pre class=\"prettyprint\"><span class=\"pun\">&gt;<\/span><span class=\"pln\"> library<\/span><span class=\"pun\">(<\/span><span class=\"pln\">RH2<\/span><span class=\"pun\">)<\/span>\n<span class=\"pun\">&gt;<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0t1.date,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0AVG(t2.x) mean,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0MIN(t2.date) fromdate,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0MAX(t2.date) todate,\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0t2.illness illness\n+ \u00a0 \u00a0 \u00a0 \u00a0from \u00a0DF t1, DF t2\n+ \u00a0 \u00a0 \u00a0 \u00a0where t1.date between t2.date + 3 and t2.date + 9\n+ \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0and t1.illness = t2.illness\n+ \u00a0 \u00a0 \u00a0 \u00a0group by t1.illness, t1.date\n+ \u00a0 \u00a0 \u00a0 \u00a0order by t1.illness, t1.date\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0date mean \u00a0 fromdate \u00a0 \u00a0 todate illness\n<\/span><span class=\"lit\">1<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">319<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pln\"> \u00a0 
\u00a0DERM\n<\/span><span class=\"lit\">2<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">353<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">3<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">347<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">4<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">324<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">5<\/span><span class=\"pln\"> \u00a0<\/span><span 
class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">295<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">6<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">308<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">7<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">280<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">8<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span 
class=\"pun\">-<\/span><span class=\"lit\">11<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">271<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">9<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">12<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">261<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">10<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">13<\/span><span class=\"pln\"> \u00a0<\/span><span class=\"lit\">239<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0 \u00a0DERM\n<\/span><span class=\"lit\">11<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">3190<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span 
class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">12<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span> <span class=\"lit\">3535<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">02<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">13<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span> <span class=\"lit\">3476<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">14<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span> <span class=\"lit\">3245<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">15<\/span> <span 
class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span> <span class=\"lit\">2950<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">05<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">16<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span> <span class=\"lit\">3086<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">06<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">17<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span> <span class=\"lit\">2807<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">07<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">18<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">11<\/span> <span class=\"lit\">2712<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span 
class=\"lit\">02<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">08<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">19<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">12<\/span> <span class=\"lit\">2610<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">03<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">09<\/span><span class=\"pln\"> \u00a0 FEVER\n<\/span><span class=\"lit\">20<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">13<\/span> <span class=\"lit\">2397<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">04<\/span> <span class=\"lit\">2006<\/span><span class=\"pun\">-<\/span><span class=\"lit\">01<\/span><span class=\"pun\">-<\/span><span class=\"lit\">10<\/span><span class=\"pln\"> \u00a0 FEVER<\/span><\/pre>\n<p>Another example which varies somewhat from a strict moving average can be found <a href=\"https:\/\/stat.ethz.ch\/pipermail\/r-help\/2011-June\/280081.html\" rel=\"nofollow\">in this post<\/a>.<\/p>\n<h2><a name=\"Example_17._Lag\"><\/a>Example 17. 
Lag<\/h2>\n<p>The following example contributed by S\u00f8ren H\u00f8jsgaard shows how to lag a column.<\/p>\n<pre class=\"prettyprint\"><span class=\"com\">## Create a lagged variable for grouped data<\/span>\n<span class=\"com\">## -----------------------------------------<\/span>\n<span class=\"com\"># Meaning that in the i'th row we not only have y[i] but also y[i-1].<\/span>\n<span class=\"com\"># This is done on a groupwise basis<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n<span class=\"kwd\">set<\/span><span class=\"pun\">.<\/span><span class=\"pln\">seed<\/span><span class=\"pun\">(<\/span><span class=\"lit\">123<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nDF <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data<\/span><span class=\"pun\">.<\/span><span class=\"pln\">frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">id<\/span><span class=\"pun\">=<\/span><span class=\"pln\">rep<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">2<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> each<\/span><span class=\"pun\">=<\/span><span class=\"lit\">5<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> tvar<\/span><span class=\"pun\">=<\/span><span class=\"pln\">rep<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"lit\">2<\/span><span class=\"pun\">),<\/span><span class=\"pln\"> y<\/span><span class=\"pun\">=<\/span><span class=\"pln\">rnorm<\/span><span class=\"pun\">(<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">10<\/span><span class=\"pun\">))<\/span>\n<span class=\"com\"># Data with lagged variable added<\/span><span class=\"pln\">\nBB <\/span><span class=\"pun\">&lt;-<\/span><span 
class=\"pln\">\n\u00a0sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select A.id, A.tvar, A.y, B.y as lag\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0from DF as A join DF as B\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0where A.rowid-1 = B.rowid and A.id=B.id\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0order by A.id, A.tvar\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># Merge with original data:<\/span><span class=\"pln\">\nDD <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\">\n\u00a0sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select DF.*, BB.lag\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0from DF left join BB\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0on DF.id=BB.id and DF.tvar=BB.tvar\"<\/span><span class=\"pun\">)<\/span>\n<span class=\"com\"># Do it all in one step:<\/span><span class=\"pln\">\nDD <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\">\n\u00a0sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select DF.*, BB.lag\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0from DF left join\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0(\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0select A.id, A.tvar, A.y, B.y as lag\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0from DF as A join DF as B\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0where A.rowid-1 = B.rowid and A.id=B.id\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0order by A.id, A.tvar\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0) as BB\n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0on DF.id=BB.id and DF.tvar=BB.tvar\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>PostgreSQL&#8217;s <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/tutorial-window.html\" rel=\"nofollow\">window<\/a> <a href=\"http:\/\/developer.postgresql.org\/pgdocs\/postgres\/functions-window.html\" rel=\"nofollow\">functions<\/a> (similar to R&#8217;s <tt>ave<\/tt> function) make referring to other rows particularly easy. 
Below we repeat the SQLite example in PostgreSQL (except that here the lag of the first row in each group is filled with NA):<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># Be sure PostgreSQL is installed and running. \u00a0<\/span><span class=\"pln\">\n\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RPostgreSQL<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"select *, lag(y) over (partition by id order by tvar) from DF\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h2><a name=\"Example_17._MySQL_Schema_Information\"><\/a>Example 18. MySQL Schema Information<\/h2>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RMySQL<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show databases\"<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"show tables\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>The following SQL statements to query the MySQL table schemas are taken from the <a href=\"http:\/\/chrisladroue.com\/2012\/03\/a-graphical-overview-of-your-mysql-database\/\" rel=\"nofollow\">blog of Christophe Ladroue<\/a>:<\/p>\n<pre class=\"prettyprint\"><span class=\"pln\">library<\/span><span class=\"pun\">(<\/span><span class=\"typ\">RMySQL<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\nlibrary<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># list each schema and its length<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span 
class=\"str\">\"SELECT TABLE_SCHEMA,SUM(DATA_LENGTH) SCHEMA_LENGTH \n\u00a0 \u00a0 \u00a0 \u00a0FROM information_schema.TABLES \n\u00a0 \u00a0 \u00a0 \u00a0WHERE TABLE_SCHEMA!='information_schema' \n\u00a0 \u00a0 \u00a0 \u00a0GROUP BY TABLE_SCHEMA\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># list each table in each schema and some info about it<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"SELECT TABLE_SCHEMA,TABLE_NAME,TABLE_ROWS,DATA_LENGTH \n\u00a0 \u00a0 \u00a0 \u00a0FROM information_schema.TABLES \n\u00a0 \u00a0 \u00a0 \u00a0WHERE TABLE_SCHEMA!='information_schema'\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<p>The following SQL statements to query the MySQL table schemas are taken from <a href=\"http:\/\/www.mysqlperformanceblog.com\/2008\/03\/17\/researching-your-mysql-table-sizes\/\" rel=\"nofollow\">the MySQL Performance Blog<\/a>:<\/p>\n<pre class=\"prettyprint\"><span class=\"com\"># Find total number of tables, rows, total data and index size<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"SELECT count(*) tables,\n\u00a0 concat(round(sum(table_rows)\/1000000,2),'M') rows,\n\u00a0 concat(round(sum(data_length)\/(1024*1024*1024),2),'G') data,\n\u00a0 concat(round(sum(index_length)\/(1024*1024*1024),2),'G') idx,\n\u00a0 concat(round(sum(data_length+index_length)\/(1024*1024*1024),2),'G') total_size,\n\u00a0 round(sum(index_length)\/sum(data_length),2) idxfrac\nFROM information_schema.TABLES\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># find biggest databases<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"SELECT\n\u00a0 \u00a0 \u00a0 \u00a0 count(*) tables,\n\u00a0 \u00a0 \u00a0 \u00a0 table_schema,concat(round(sum(table_rows)\/1000000,2),'M') rows,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(data_length)\/(1024*1024*1024),2),'G') data,\n\u00a0 \u00a0 \u00a0 \u00a0 
concat(round(sum(index_length)\/(1024*1024*1024),2),'G') idx,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(data_length+index_length)\/(1024*1024*1024),2),'G') total_size,\n\u00a0 \u00a0 \u00a0 \u00a0 round(sum(index_length)\/sum(data_length),2) idxfrac\n\u00a0 \u00a0 \u00a0 \u00a0 FROM information_schema.TABLES\n\u00a0 \u00a0 \u00a0 \u00a0 GROUP BY table_schema\n\u00a0 \u00a0 \u00a0 \u00a0 ORDER BY sum(data_length+index_length) DESC LIMIT 10\"<\/span><span class=\"pun\">)<\/span>\n\n<span class=\"com\"># data distribution by storage engine<\/span><span class=\"pln\">\nsqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">\"SELECT engine,\n\u00a0 \u00a0 \u00a0 \u00a0 count(*) tables,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(table_rows)\/1000000,2),'M') rows,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(data_length)\/(1024*1024*1024),2),'G') data,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(index_length)\/(1024*1024*1024),2),'G') idx,\n\u00a0 \u00a0 \u00a0 \u00a0 concat(round(sum(data_length+index_length)\/(1024*1024*1024),2),'G') total_size,\n\u00a0 \u00a0 \u00a0 \u00a0 round(sum(index_length)\/sum(data_length),2) idxfrac\n\u00a0 \u00a0 \u00a0 \u00a0 FROM information_schema.TABLES\n\u00a0 \u00a0 \u00a0 \u00a0 GROUP BY engine\n\u00a0 \u00a0 \u00a0 \u00a0 ORDER BY sum(data_length+index_length) DESC LIMIT 10\"<\/span><span class=\"pun\">)<\/span><\/pre>\n<h1><a name=\"Links\"><\/a>Links<\/h1>\n<p><a href=\"http:\/\/www.codeproject.com\/Articles\/33052\/Visual-Representation-of-SQL-Joins\" rel=\"nofollow\">Visual Representation of SQL Joins<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>To write it, it took three months; to conceive it \u2013 three minutes; to collect the data in it \u2013 all my life. F. 
Scott&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-801","post","type-post","status-publish","format-standard","hentry","category-r"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/801","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=801"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/801\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=801"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=801"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/tags?post=801"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}