{"id":338,"date":"2013-11-25T13:01:24","date_gmt":"2013-11-25T18:01:24","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=338"},"modified":"2013-11-25T13:01:24","modified_gmt":"2013-11-25T18:01:24","slug":"export-data-frames-to-multi-worksheet-excel-file","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2013\/11\/25\/export-data-frames-to-multi-worksheet-excel-file\/","title":{"rendered":"Export Data Frames To Multi-worksheet Excel File"},"content":{"rendered":"<p>A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output csv-files from R and then import them manually or with the help of VBA into Excel, I was after a more streamlined solution, as I would need to repeat this process quite regularly in the future.<\/p>\n<p>CRAN has several packages that offer the functionality of creating an Excel file, however several of them provide only the very basic functionality. The\u00a0<a href=\"http:\/\/wiki.r-project.org\/rwiki\/doku.php?id=tips:data-io:ms_windows\">R-wiki page<\/a>\u00a0on exchanging data between R and Windows applications focuses mainly on the data import problem.<\/p>\n<p>My objective was to find an export method that would allow me to easily split a larger dataframe by values of a given variable so that each subset would be exported to its own worksheet in the same Excel file. I tried out the different ways of achieving this and documented my findings below.<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<h2><a name=\"_data_preparation\"><\/a>Data Preparation<\/h2>\n<p>The goal is to split the\u00a0<tt>iris<\/tt>\u00a0dataset by the unique values of\u00a0<tt>iris$Species<\/tt>, and export it to a new Excel file with worksheets: setosa, versicolor and virginica.<\/p>\n<p>Two ways of storing the information were used for the purposes of this exercise:<\/p>\n<h3><a name=\"_list_of_dataframes\"><\/a>List of dataframes<\/h3>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(plyr)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; testlist &lt;- dlply(iris, .(Species))<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><a name=\"_a_character_vector_of_containing_names_of_data_frames\"><\/a>A character vector of containing names of data frames<\/h3>\n<p>First create the data frames, and then the character vector with the object names \u2013 now we will have three separate data frames called setosa, versicolor and virginica.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; d_ply(iris, .(Species), function(df) {\n     assign(as.character(df$Species[1]), df,\n         envir = .GlobalEnv)\n })\n&gt; testchar &lt;- as.character(unique(iris$Species))\n&gt; testchar1 &lt;- paste(testchar, collapse = \",\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Storing data in a list of dataframes is obviously much more convenient, however, as you see later many (or most) functions don\u2019t accept lists as input.<\/p>\n<hr \/>\n<h2><a name=\"_dataframes2xls\"><\/a>dataframes2xls<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>Requires\u00a0<tt>Python (&gt;= 2.4)<\/tt><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/dataframes2xls\/index.html\"><tt>dataframes2xls<\/tt><\/a>\u00a0saves dataframes to an xls file. Its main function\u00a0<tt>write.xls<\/tt>, is a wrapper around a utility called\u00a0<tt>xls2csv<\/tt>.\u00a0<tt>xls2csv<\/tt>\u00a0makes use of the\u00a0<tt>Python<\/tt>\u00a0module\u00a0<a href=\"http:\/\/sourceforge.net\/projects\/pyexcelerator\/\"><tt>pyExcelerator<\/tt><\/a>\u00a0and the\u00a0<tt>afm<\/tt>submodule of the\u00a0<tt>Python<\/tt>\u00a0module\u00a0<a href=\"http:\/\/sourceforge.net\/projects\/matplotlib\/\"><tt>matplotlib<\/tt><\/a>, both of which are included in\u00a0<tt>dataframes2xls<\/tt>.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(dataframes2xls)\n&gt; dataframes2xls::write.xls(c(setosa, versicolor,\n     virginica), \"dataframes2xls.xls\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Appending to an existing file not possible, this negates the use of list of dataframes as input. One major shortcoming is that one needs to specify the names of exported dataframes manually. Also, I was not able to pass sheet names to the function if there was more than one sheet.<\/p>\n<p>There is a way to work around the manual specification of data frames, but as you can see it is not very intuitive:<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; eval(parse(text = paste(\"dataframes2xls::write.xls(c(\",\n     testchar1, \"),\", \"'dataframes2xls.xls')\")))<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h2><a name=\"_writexls\"><\/a>WriteXLS<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>Requires\u00a0<tt>Perl<\/tt>\u00a0with module\u00a0<tt>Text::CSV_XS<\/tt><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/WriteXLS\/\">WriteXLS<\/a>\u00a0is a \u201c<tt>Perl<\/tt>\u00a0based R function to create Excel (XLS) files from one or more data frames. Each data frame will be written to a separate named worksheet in the Excel spreadsheet. The worksheet name will be the name of the data frame it contains or can be specified by the user\u201d.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(WriteXLS)\n&gt; WriteXLS(testchar, \"WriteXLS.xls\", perl = perl)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Like\u00a0<tt>dataframes2xls<\/tt>\u00a0<tt>WriteXLS<\/tt>\u00a0does not take lists as input, and therefore each of the data frames needs to be generated beforehand before calling the function(s). Also appending to a file is not possible as the Excel file, if it exists, is overwritten.<\/p>\n<hr \/>\n<h2><a name=\"_xlsreadwrite_pro\"><\/a>xlsReadWrite(Pro)<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>This package currently works only on Windows machines.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite.htm\">xlsReadWrite<\/a>\u00a0saves a data frame, matrix or vector as an Excel file in Excel 97-2003 file format.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(xlsReadWrite)\n&gt; xlsReadWrite::write.xls(iris, \"xlsReadWrite.xls\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The dataframe can be written to one sheet only. It is not possible to split the data between separate sheets, as append data to existing files feature is available in the\u00a0<a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite2.htm\">Pro-version<\/a>.<\/p>\n<p>The\u00a0<a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite2.htm\">Pro-version<\/a>\u00a0has a 30-day trial, so I tried it out.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(xlsReadWritePro)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; rfile &lt;- \"xlsReadWritePro.xls\"\n&gt; exc &lt;- xls.new(rfile)\n&gt; l_ply(testlist, function(x) {\n     sheet &lt;- as.character(unique(x$Species))\n     xlsReadWritePro::write.xls(x, file = exc,\n         sheet = sheet, colNames = TRUE)\n })\n&gt; xls.close(exc)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Task accomplished.<\/p>\n<p>The Pro-version has several other nice features, such as the ability to save images to an Excel file, or to write Excel formulas.<\/p>\n<p>For reference, the\u00a0<a href=\"http:\/\/treetron.googlepages.com\/buy.htm\">licence costs<\/a>\u00a0are as follows.<\/p>\n<ul>\n<li>Single user license: 75 euros<\/li>\n<li>Non-commercial single user: 19 euros<\/li>\n<li>Company\/university wide: 570 euros<\/li>\n<\/ul>\n<hr \/>\n<h2><a name=\"_rodbc\"><\/a>RODBC<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>There are ODBC Excel drivers for Windows only.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/RODBC\/index.html\">RODBC<\/a>\u00a0<tt>sqlSave<\/tt>\u00a0function saves the data frame in the specified worksheet via\u00a0<tt>ODBC<\/tt>\u00a0after initiating the connection using a convenience wrapper\u00a0<tt>odbcConnectExcel<\/tt>.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(RODBC)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; save2excel &lt;- function(x) sqlSave(xlsFile,\n     x, tablename = x$Species[1], rownames = FALSE)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xlsFile &lt;- odbcConnectExcel(\"RODBC.xls\", readOnly = FALSE)\n&gt; l_ply(testlist, save2excel)\n&gt; odbcCloseAll()<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This worked well. Another good thing to note is that you can append to already existing files.<\/p>\n<hr \/>\n<h2><a name=\"_rdcomclient\"><\/a>RDCOMClient<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>It requires a Windows machine that has Excel running on it.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/www.omegahat.org\/RDCOMClient\/\"><tt>RDCOMClient<\/tt><\/a>\u00a0allows to access and control applications such as Excel, Word, PowerPoint, Web browsers from within R session. As an alternative,\u00a0<a href=\"http:\/\/cran.r-project.org\/web\/packages\/rcom\/index.html\"><tt>rcom<\/tt><\/a>\u00a0package provides similar functionality.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(RDCOMClient)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Developer\u2019s website provides some useful functions for exporting\/importing dataframes to\/from Excel.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; source(\"http:\/\/www.omegahat.org\/RDCOMClient\/examples\/excelUtils3.R\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xls &lt;- COMCreate(\"Excel.Application\")\n&gt; xls[[\"Visible\"]] &lt;- TRUE\n&gt; wb = xls[[\"Workbooks\"]]$Add(1)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; rdcomexport &lt;- function(x) {\n     sh = wb[[\"Worksheets\"]]$Add()\n     sh[[\"Name\"]] &lt;- as.character(x$Species[1])\n     exportDataFrame(x, at = sh$Range(\"A1\"))\n }\n&gt; d_ply(iris, .(Species), rdcomexport)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now I have an Excel file open with an empty \u201cSheet1\u2033. I delete it and, after specifying the save-directory (it defaults to\u00a0<tt>My Documents<\/tt>) &amp; filename , save the file. I have Office 2007 installed, so this file format is used by default.<\/p>\n<table width=\"100%\" border=\"0\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xls$Sheets(\"Sheet1\")$Delete()\n&gt; filename &lt;- paste(getwd(), \"RDCOMClient.xlsx\",\n     sep = \"\/\")\n&gt; filename &lt;- gsub('\/', '\\\\', filename)\n&gt; wb$SaveAs(filename)\n&gt; wb$Close(filename)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Another example of\u00a0<tt>RDCOMClient<\/tt>\u00a0in action can be seen in\u00a0<a href=\"http:\/\/www.nabble.com\/Excel-Export-in-a-beauty-way-td23850532.html\">this R-help post<\/a>.<\/p>\n<hr \/>\n<h2><a name=\"_conclusion\"><\/a>Conclusion<\/h2>\n<p>The Windows-only solutions were (expectedly) the most flexible with\u00a0<tt>RDCOMClient<\/tt>-approach providing the greatest control over output. However, if formatting was not that important the simplest way to export data to multiple Excel worksheets would be via\u00a0<tt>RODBC<\/tt>\u00a0(if on Windows machine).<\/p>\n<div><a href=\"http:\/\/en.wordpress.com\/about-these-ads\/\" rel=\"nofollow\">About these ads<\/a>&nbsp;<\/p>\n<\/div>\n<div id=\"jp-post-flair\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output csv-files&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-338","post","type-post","status-publish","format-standard","hentry","category-r"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/338","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=338"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/338\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=338"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=338"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/tags?post=338"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}