{"id":683,"date":"2014-11-05T11:32:22","date_gmt":"2014-11-05T18:32:22","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=683"},"modified":"2014-11-05T11:32:22","modified_gmt":"2014-11-05T18:32:22","slug":"export-data-frames-to-multi-worksheet-excel-file-2","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2014\/11\/05\/export-data-frames-to-multi-worksheet-excel-file-2\/","title":{"rendered":"Export Data Frames To Multi-worksheet Excel File"},"content":{"rendered":"<p>A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output csv-files from R and then import them manually or with the help of VBA into Excel, I was after a more streamlined solution, as I would need to repeat this process quite regularly in the future.<\/p>\n<p>CRAN has several packages that offer the functionality of creating an Excel file, however several of them provide only the very basic functionality. The <a href=\"http:\/\/rwiki.sciviews.org\/doku.php?id=tips:data-io:ms_windows\" target=\"_blank\" rel=\"noopener\">R-wiki page<\/a> on exchanging data between R and Windows applications focuses mainly on the data import problem.<\/p>\n<p>My objective was to find an export method that would allow me to easily split a larger dataframe by values of a given variable so that each subset would be exported to its own worksheet in the same Excel file. I tried out the different ways of achieving this and documented my findings below.<\/p>\n<hr \/>\n<h2><a name=\"_data_preparation\"><\/a>Data Preparation<\/h2>\n<p>The goal is to split the <tt>iris<\/tt> dataset by the unique values of <tt>iris$Species<\/tt>, and export it to a new Excel file with worksheets: setosa, versicolor and virginica.<\/p>\n<p>Two ways of storing the information were used for the purposes of this exercise:<\/p>\n<h3><a name=\"_list_of_dataframes\"><\/a>List of dataframes<\/h3>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(plyr)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; testlist &lt;- dlply(iris, .(Species))<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><a name=\"_a_character_vector_of_containing_names_of_data_frames\"><\/a>A character vector of containing names of data frames<\/h3>\n<p>First create the data frames, and then the character vector with the object names \u2013 now we will have three separate data frames called setosa, versicolor and virginica.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; d_ply(iris, .(Species), function(df) {\n     assign(as.character(df$Species[1]), df,\n         envir = .GlobalEnv)\n })\n&gt; testchar &lt;- as.character(unique(iris$Species))\n&gt; testchar1 &lt;- paste(testchar, collapse = \",\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Storing data in a list of dataframes is obviously much more convenient, however, as you see later many (or most) functions don\u2019t accept lists as input.<\/p>\n<hr \/>\n<h2><a name=\"_dataframes2xls\"><\/a>dataframes2xls<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>Requires <tt>Python (&gt;= 2.4)<\/tt><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/dataframes2xls\/index.html\" target=\"_blank\" rel=\"noopener\"><tt>dataframes2xls<\/tt><\/a> saves dataframes to an xls file. Its main function<tt>write.xls<\/tt>, is a wrapper around a utility called <tt>xls2csv<\/tt>. <tt>xls2csv<\/tt> makes use of the <tt>Python<\/tt> module <a href=\"http:\/\/sourceforge.net\/projects\/pyexcelerator\/\" target=\"_blank\" rel=\"noopener\"><tt>pyExcelerator<\/tt><\/a> and the <tt>afm<\/tt> submodule of the<tt>Python<\/tt> module <a href=\"http:\/\/sourceforge.net\/projects\/matplotlib\/\" target=\"_blank\" rel=\"noopener\"><tt>matplotlib<\/tt><\/a>, both of which are included in<tt>dataframes2xls<\/tt>.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(dataframes2xls)\n&gt; dataframes2xls::write.xls(c(setosa, versicolor,\n     virginica), \"dataframes2xls.xls\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Appending to an existing file not possible, this negates the use of list of dataframes as input. One major shortcoming is that one needs to specify the names of exported dataframes manually. Also, I was not able to pass sheet names to the function if there was more than one sheet.<\/p>\n<p>There is a way to work around the manual specification of data frames, but as you can see it is not very intuitive:<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; eval(parse(text = paste(\"dataframes2xls::write.xls(c(\",\n     testchar1, \"),\", \"'dataframes2xls.xls')\")))<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h2><a name=\"_writexls\"><\/a>WriteXLS<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>Requires <tt>Perl<\/tt> with module <tt>Text::CSV_XS<\/tt><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/WriteXLS\/\" target=\"_blank\" rel=\"noopener\">WriteXLS<\/a> is a \u201c<tt>Perl<\/tt> based R function to create Excel (XLS) files from one or more data frames. Each data frame will be written to a separate named worksheet in the Excel spreadsheet. The worksheet name will be the name of the data frame it contains or can be specified by the user\u201d.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(WriteXLS)\n&gt; WriteXLS(testchar, \"WriteXLS.xls\", perl = perl)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Like <tt>dataframes2xls<\/tt> <tt>WriteXLS<\/tt> does not take lists as input, and therefore each of the data frames needs to be generated beforehand before calling the function(s). Also appending to a file is not possible as the Excel file, if it exists, is overwritten.<\/p>\n<hr \/>\n<h2><a name=\"_xlsreadwrite_pro\"><\/a>xlsReadWrite(Pro)<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>This package currently works only on Windows machines.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite.htm\" target=\"_blank\" rel=\"noopener\">xlsReadWrite<\/a> saves a data frame, matrix or vector as an Excel file in Excel 97-2003 file format.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(xlsReadWrite)\n&gt; xlsReadWrite::write.xls(iris, \"xlsReadWrite.xls\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The dataframe can be written to one sheet only. It is not possible to split the data between separate sheets, as append data to existing files feature is available in the <a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite2.htm\" target=\"_blank\" rel=\"noopener\">Pro-version<\/a>.<\/p>\n<p>The <a href=\"http:\/\/treetron.googlepages.com\/xlsreadwrite2.htm\" target=\"_blank\" rel=\"noopener\">Pro-version<\/a> has a 30-day trial, so I tried it out.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(xlsReadWritePro)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; rfile &lt;- \"xlsReadWritePro.xls\"\n&gt; exc &lt;- xls.new(rfile)\n&gt; l_ply(testlist, function(x) {\n     sheet &lt;- as.character(unique(x$Species))\n     xlsReadWritePro::write.xls(x, file = exc,\n         sheet = sheet, colNames = TRUE)\n })\n&gt; xls.close(exc)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Task accomplished.<\/p>\n<p>The Pro-version has several other nice features, such as the ability to save images to an Excel file, or to write Excel formulas.<\/p>\n<p>For reference, the <a href=\"http:\/\/treetron.googlepages.com\/buy.htm\" target=\"_blank\" rel=\"noopener\">licence costs<\/a> are as follows.<\/p>\n<ul>\n<li>Single user license: 75 euros<\/li>\n<li>Non-commercial single user: 19 euros<\/li>\n<li>Company\/university wide: 570 euros<\/li>\n<\/ul>\n<hr \/>\n<h2><a name=\"_rodbc\"><\/a>RODBC<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>There are ODBC Excel drivers for Windows only.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/cran.r-project.org\/web\/packages\/RODBC\/index.html\" target=\"_blank\" rel=\"noopener\">RODBC<\/a> <tt>sqlSave<\/tt> function saves the data frame in the specified worksheet via<tt>ODBC<\/tt> after initiating the connection using a convenience wrapper<tt>odbcConnectExcel<\/tt>.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(RODBC)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; save2excel &lt;- function(x) sqlSave(xlsFile,\n     x, tablename = x$Species[1], rownames = FALSE)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xlsFile &lt;- odbcConnectExcel(\"RODBC.xls\", readOnly = FALSE)\n&gt; l_ply(testlist, save2excel)\n&gt; odbcCloseAll()<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This worked well. Another good thing to note is that you can append to already existing files.<\/p>\n<hr \/>\n<h2><a name=\"_rdcomclient\"><\/a>RDCOMClient<\/h2>\n<table>\n<tbody>\n<tr valign=\"top\">\n<td><b><span style=\"text-decoration: underline;\">Note<\/span><\/b><\/td>\n<td>It requires a Windows machine that has Excel running on it.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><a href=\"http:\/\/www.omegahat.org\/RDCOMClient\/\" target=\"_blank\" rel=\"noopener\"><tt>RDCOMClient<\/tt><\/a> allows to access and control applications such as Excel, Word, PowerPoint, Web browsers from within R session. As an alternative, <a href=\"http:\/\/cran.r-project.org\/web\/packages\/rcom\/index.html\" target=\"_blank\" rel=\"noopener\"><tt>rcom<\/tt><\/a>package provides similar functionality.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; library(RDCOMClient)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Developer\u2019s website provides some useful functions for exporting\/importing dataframes to\/from Excel.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; source(\"http:\/\/www.omegahat.org\/RDCOMClient\/examples\/excelUtils3.R\")<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xls &lt;- COMCreate(\"Excel.Application\")\n&gt; xls[[\"Visible\"]] &lt;- TRUE\n&gt; wb = xls[[\"Workbooks\"]]$Add(1)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; rdcomexport &lt;- function(x) {\n     sh = wb[[\"Worksheets\"]]$Add()\n     sh[[\"Name\"]] &lt;- as.character(x$Species[1])\n     exportDataFrame(x, at = sh$Range(\"A1\"))\n }\n&gt; d_ply(iris, .(Species), rdcomexport)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now I have an Excel file open with an empty \u201cSheet1\u2033. I delete it and, after specifying the save-directory (it defaults to <tt>My Documents<\/tt>) &amp; filename , save the file. I have Office 2007 installed, so this file format is used by default.<\/p>\n<table border=\"0\" width=\"100%\" bgcolor=\"#e8e8e8\">\n<tbody>\n<tr>\n<td>\n<pre>&gt; xls$Sheets(\"Sheet1\")$Delete()\n&gt; filename &lt;- paste(getwd(), \"RDCOMClient.xlsx\",\n     sep = \"\/\")\n&gt; filename &lt;- gsub('\/', '\\', filename)\n&gt; wb$SaveAs(filename)\n&gt; wb$Close(filename)<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Another example of <tt>RDCOMClient<\/tt> in action can be seen in <a href=\"http:\/\/www.nabble.com\/Excel-Export-in-a-beauty-way-td23850532.html\" target=\"_blank\" rel=\"noopener\">this R-help post<\/a>.<\/p>\n<hr \/>\n<h2><a name=\"_conclusion\"><\/a>Conclusion<\/h2>\n<p>The Windows-only solutions were (expectedly) the most flexible with<tt>RDCOMClient<\/tt>-approach providing the greatest control over output. However, if formatting was not that important the simplest way to export data to multiple Excel worksheets would be via <tt>RODBC<\/tt> (if on Windows machine).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago I needed to export a number of data frames to separate worksheets in an Excel file. Although one could output csv-files&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-683","post","type-post","status-publish","format-standard","hentry","category-r"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/683","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=683"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/683\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=683"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=683"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/tags?post=683"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}