{"id":632,"date":"2014-06-05T16:03:16","date_gmt":"2014-06-05T21:03:16","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=632"},"modified":"2014-06-05T16:03:16","modified_gmt":"2014-06-05T21:03:16","slug":"importing-data-into-r-from-different-sources-2","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2014\/06\/05\/importing-data-into-r-from-different-sources-2\/","title":{"rendered":"Importing Data Into R from Different Sources"},"content":{"rendered":"<p>I get data from many different sources. These sources range from simple .csv files to more complex relational databases to structured XML or JSON files. I have compiled the different approaches that one can use to easily access these datasets.<\/p>\n<p><strong>Local Column Delimited Files<\/strong><\/p>\n<p>This is probably the most common and easiest way to load data into R. It requires only one line to read the data, plus a couple of additional lines to tidy up the dataset.<\/p>\n<div id=\"highlighter_313560\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" 
style=\"margin-left: 0px !important;\"><code class=\"plain\">file &lt;- <\/code><code class=\"string\">\"c:\\\\my_folder\\\\my_file.txt\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_data &lt;- read.csv(file, sep=<\/code><code class=\"string\">\",\"<\/code><code class=\"plain\">); ##<\/code><code class=\"string\">'sep'<\/code> <code class=\"plain\">can be a number of options including \\t for tab delimited<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">names(raw_data) &lt;- c(<\/code><code class=\"string\">\"VAR1\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"VAR2\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"RESPONSE1\"<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Text File From the Internet<\/strong><\/p>\n<p>I find this very useful when I need to get datasets from a Web site, particularly if I need to rerun the script and the Web site continually updates its data. This saves me from having to download the dataset into a csv file each time I need to run an update. In this example I use one of my favorite data sources, the National Data Buoy Center. This example pulls data from a buoy (buoy #44025) off the coast of New Jersey. Conveniently, you can use the same read.csv() function that you would use if you read the file from your own computer. 
You simply replace the file location with the URL of the data.<\/p>\n<div id=\"highlighter_390\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">file &lt;- <\/code><code class=\"string\">\"http:\/\/www.ndbc.noaa.gov\/view_text_file.php?filename=44025h2011.txt.gz&amp;dir=data\/historical\/stdmet\/\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span 
class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_data &lt;- read.csv(file, header=TRUE, skip=<\/code><code class=\"value\">1<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Files From Other Software<\/strong><\/p>\n<p>Often I will have Excel files, SPSS files, or SAS datasets sent to me. I could export the data as a csv file and then import it using the <em>read.csv<\/em> function. However, taking that approach every time adds an extra step. Adding unnecessary steps to a process increases the risk that the data might get corrupted due to human error. Furthermore, if the data is updated from time to time, the file that you exported last week may not contain the most current data.<\/p>\n<p><em>SPSS<\/em><\/p>\n<div id=\"highlighter_300266\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">library(foreign)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><span class=\"content\"><span 
class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">file &lt;- <\/code><code class=\"string\">\"C:\\\\my_folder\\\\my_file.sav\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw &lt;- as.data.frame(read.spss(file))<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><em>Microsoft Excel<\/em><\/p>\n<div id=\"highlighter_339590\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">library(XLConnect)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">file &lt;- <\/code><code class=\"string\">\"C:\\\\my_folder\\\\my_file.xlsx\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">4.<\/code><span class=\"content\"><span class=\"block\" 
style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_wb &lt;- loadWorkbook(file, create=FALSE)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">5.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw &lt;- as.data.frame( readWorksheet(raw_wb, sheet=<\/code><code class=\"string\">'Sheet1'<\/code><code class=\"plain\">) )<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Data From Relational Databases<\/strong><\/p>\n<p>The RMySQL library is very useful. However, I have generally been in the habit of using the RODBC library, because I will often jump between databases (e.g. Oracle, MSSQL, MySQL). By using the RODBC library I can keep all of my connections in one location and use the same functions regardless of the database. The example below will work on any standard SQL database. You just need to make sure you set up an ODBC connection called (in this example) MY_DATABASE.<\/p>\n<div id=\"highlighter_104314\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" 
style=\"margin-left: 0px !important;\"><code class=\"plain\">library(RODBC)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">channel &lt;- odbcConnect(<\/code><code class=\"string\">\"MY_DATABASE\"<\/code><code class=\"plain\">, uid=<\/code><code class=\"string\">\"username\"<\/code><code class=\"plain\">, pwd=<\/code><code class=\"string\">\"password\"<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">4.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw &lt;- sqlQuery(channel, <\/code><code class=\"string\">\"SELECT * FROM Table1\"<\/code><code class=\"plain\">);<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Data from Non-Relational Databases<\/strong><\/p>\n<p>R has the capability to pull data from non-relational databases. These include Hadoop (rhbase), Cassandra (RCassandra), and MongoDB (rmongodb). I personally have not used RCassandra, but here is the <a title=\"RCassandra Documentation\" href=\"http:\/\/cran.r-project.org\/web\/packages\/RCassandra\/RCassandra.pdf\"><span style=\"color: #0066cc;\">documentation<\/span><\/a>. 
The example here uses MongoDB and follows an <a title=\"MongoDB Schema Design\" href=\"http:\/\/www.mongodb.org\/display\/DOCS\/Schema+Design\"><span style=\"color: #0066cc;\">example<\/span><\/a> provided by MongoDB.<\/p>\n<div id=\"highlighter_638200\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">01.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">library(rmongodb)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">02.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MyMongodb &lt;- <\/code><code class=\"string\">\"test\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">03.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">ns &lt;- <\/code><code class=\"string\">\"articles\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">04.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mongo &lt;- 
mongo.create(db=MyMongodb)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">05.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">06.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">list.d &lt;- mongo.bson.from.list(list(<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">07.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"string\">\"_id\"<\/code><code class=\"plain\">=<\/code><code class=\"string\">\"wes\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">08.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">name=list(first=<\/code><code class=\"string\">\"Wesley\"<\/code><code class=\"plain\">, last=<\/code><code class=\"string\">\"\"<\/code><code class=\"plain\">),<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">09.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">sex=<\/code><code class=\"string\">\"M\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">10.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">age=<\/code><code class=\"value\">40<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">11.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">value=c(<\/code><code class=\"string\">\"7\"<\/code><code class=\"plain\">, <\/code><code class=\"string\">\"5\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"8\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"2\"<\/code><code 
class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">12.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">))<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">13.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mongo.insert(mongo, <\/code><code class=\"string\">\"test.MyPeople\"<\/code><code class=\"plain\">, list.d)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">14.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">list.d<\/code><code class=\"value\">2<\/code> <code class=\"plain\">&lt;- mongo.bson.from.list(list(<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">15.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"string\">\"_id\"<\/code><code class=\"plain\">=<\/code><code class=\"string\">\"Article1\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">16.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">when=mongo.timestamp.create(strptime(<\/code><code class=\"string\">\"2012-10-01 01:30:00\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">17.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"string\">\"%Y-%m-%d %H:%M:%S\"<\/code><code class=\"plain\">), increment=<\/code><code class=\"value\">1<\/code><code class=\"plain\">),<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">18.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code 
class=\"plain\">author=<\/code><code class=\"string\">\"wes\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">19.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">title=<\/code><code class=\"string\">\"Importing Data Into R from Different Sources\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">20.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">text=<\/code><code class=\"string\">\"Provides R code on how to import data into R from different sources.\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">21.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">tags=c(<\/code><code class=\"string\">\"R\"<\/code><code class=\"plain\">, <\/code><code class=\"string\">\"MongoDB\"<\/code><code class=\"plain\">, <\/code><code class=\"string\">\"Cassandra\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"MySQL\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"Excel\"<\/code><code class=\"plain\">,<\/code><code class=\"string\">\"SPSS\"<\/code><code class=\"plain\">),<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">22.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">comments=list(<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">23.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">list(<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">24.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code 
class=\"plain\">who=<\/code><code class=\"string\">\"wes\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">25.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">when=mongo.timestamp.create(strptime(<\/code><code class=\"string\">\"2012-10-01 01:35:00\"<\/code><code class=\"plain\">,<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">26.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"string\">\"%Y-%m-%d %H:%M:%S\"<\/code><code class=\"plain\">), increment=<\/code><code class=\"value\">1<\/code><code class=\"plain\">),<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">27.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">comment=<\/code><code class=\"string\">\"I'm open to comments or suggestions on other data sources to include.\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">28.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">29.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">30.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">31.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">32.<\/code><span class=\"content\"><span class=\"block\" 
style=\"margin-left: 0px !important;\"><code class=\"plain\">list.d<\/code><code class=\"value\">2<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">33.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mongo.insert(mongo, <\/code><code class=\"string\">\"test.MyArticles\"<\/code><code class=\"plain\">, list.d<\/code><code class=\"value\">2<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">34.<\/code><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">35.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">res &lt;- mongo.find(mongo, <\/code><code class=\"string\">\"test.MyArticles\"<\/code><code class=\"plain\">, query=list(author=<\/code><code class=\"string\">\"wes\"<\/code><code class=\"plain\">), fields=list(title=<\/code><code class=\"value\">1<\/code><code class=\"plain\">L))<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">36.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">out &lt;- NULL<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">37.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">while (mongo.cursor.next(res)){<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">38.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">out &lt;- c(out, list(mongo.bson.to.list(mongo.cursor.value(res))))<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">39.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">40.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code 
class=\"plain\">}<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">41.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">42.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">out<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Copied and Pasted Text<\/strong><\/p>\n<div id=\"highlighter_724615\" class=\"syntaxhighlighter \">\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">01.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_txt &lt;- \"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">02.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">STATE READY TOTAL<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">03.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">AL <\/code><code class=\"value\">36<\/code> <code class=\"value\">36<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code 
class=\"number\">04.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">AK <\/code><code class=\"value\">5<\/code> <code class=\"value\">8<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">05.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">AZ <\/code><code class=\"value\">15<\/code> <code class=\"value\">16<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">06.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">AR <\/code><code class=\"value\">21<\/code> <code class=\"value\">27<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">07.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">CA <\/code><code class=\"value\">43<\/code> <code class=\"value\">43<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">08.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">CT <\/code><code class=\"value\">56<\/code> <code class=\"value\">68<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">09.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">DE <\/code><code class=\"value\">22<\/code> <code class=\"value\">22<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">10.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">DC <\/code><code class=\"value\">7<\/code> <code class=\"value\">7<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">11.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code 
class=\"plain\">FL <\/code><code class=\"value\">130<\/code> <code class=\"value\">132<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">12.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">GA <\/code><code class=\"value\">53<\/code> <code class=\"value\">54<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">13.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">HI <\/code><code class=\"value\">11<\/code> <code class=\"value\">16<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">14.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">ID <\/code><code class=\"value\">11<\/code> <code class=\"value\">11<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">15.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">IL <\/code><code class=\"value\">24<\/code> <code class=\"value\">24<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">16.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">IN <\/code><code class=\"value\">65<\/code> <code class=\"value\">77<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">17.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">IA <\/code><code class=\"value\">125<\/code> <code class=\"value\">130<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">18.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">KS <\/code><code class=\"value\">22<\/code> <code class=\"value\">26<\/code><\/span><\/span><\/pre>\n<pre 
class=\"line alt1\"><code class=\"number\">19.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">KY <\/code><code class=\"value\">34<\/code> <code class=\"value\">34<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">20.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">LA <\/code><code class=\"value\">27<\/code> <code class=\"value\">34<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">21.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">ME <\/code><code class=\"value\">94<\/code> <code class=\"value\">96<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">22.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MD <\/code><code class=\"value\">25<\/code> <code class=\"value\">26<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">23.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MA <\/code><code class=\"value\">82<\/code> <code class=\"value\">92<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">24.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">Mi <\/code><code class=\"value\">119<\/code> <code class=\"value\">126<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">25.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MN <\/code><code class=\"value\">69<\/code> <code class=\"value\">80<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">26.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px 
!important;\"><code class=\"plain\">MS <\/code><code class=\"value\">43<\/code> <code class=\"value\">43<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">27.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MO <\/code><code class=\"value\">74<\/code> <code class=\"value\">82<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">28.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">MT <\/code><code class=\"value\">34<\/code> <code class=\"value\">40<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">29.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NE <\/code><code class=\"value\">9<\/code> <code class=\"value\">13<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">30.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NV <\/code><code class=\"value\">64<\/code> <code class=\"value\">64<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">31.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NM <\/code><code class=\"value\">120<\/code> <code class=\"value\">137<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">32.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NY <\/code><code class=\"value\">60<\/code> <code class=\"value\">62<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">33.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NJ <\/code><code class=\"value\">29<\/code> <code 
class=\"value\">33<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">34.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NH <\/code><code class=\"value\">44<\/code> <code class=\"value\">45<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">35.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">ND <\/code><code class=\"value\">116<\/code> <code class=\"value\">135<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">36.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">NC <\/code><code class=\"value\">29<\/code> <code class=\"value\">33<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">37.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">OH <\/code><code class=\"value\">114<\/code> <code class=\"value\">130<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">38.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">OK <\/code><code class=\"value\">19<\/code> <code class=\"value\">22<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">39.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">PA <\/code><code class=\"value\">101<\/code> <code class=\"value\">131<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">40.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">RI <\/code><code class=\"value\">32<\/code> <code class=\"value\">32<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">41.<\/code><span 
class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">SC <\/code><code class=\"value\">35<\/code> <code class=\"value\">45<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">42.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">SD <\/code><code class=\"value\">25<\/code> <code class=\"value\">25<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">43.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">TN <\/code><code class=\"value\">30<\/code> <code class=\"value\">34<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">44.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">TX <\/code><code class=\"value\">14<\/code> <code class=\"value\">25<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">45.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">UT <\/code><code class=\"value\">11<\/code> <code class=\"value\">11<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">46.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">VT <\/code><code class=\"value\">33<\/code> <code class=\"value\">49<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">47.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">VA <\/code><code class=\"value\">108<\/code> <code class=\"value\">124<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">48.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">WV <\/code><code 
class=\"value\">27<\/code> <code class=\"value\">36<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">49.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">WI <\/code><code class=\"value\">122<\/code> <code class=\"value\">125<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">50.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">WY <\/code><code class=\"value\">12<\/code> <code class=\"value\">14<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">51.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">52.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_data &lt;- textConnection(raw_txt)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">53.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw &lt;- read.table(raw_data, header=TRUE, comment.char=<\/code><code class=\"string\">\"#\"<\/code><code class=\"plain\">, sep=<\/code><code class=\"string\">\"\"<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">54.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">close.connection(raw_data)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">55.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">56.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code 
class=\"number\">57.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">58.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">###Or the following line can be used<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">59.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">60.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw &lt;- read.table(header=TRUE, text=raw_txt)<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p><strong>Structured Local or Remote Data<\/strong><\/p>\n<p>One feature that I find quite useful is the ability to import a table directly from a Web site that I want to analyze. R can read through the HTML and import the table that you want. This example uses the <em>XML<\/em> package and pulls down the world population by country. Once the data is brought into R, it may need to be cleaned up a bit by removing unnecessary columns and other stray characters. The examples here use remote data from other Web sites. 
If the data is available as a local file, it can be imported in a similar fashion by using the filename rather than the URL.<\/p>\n<div id=\"highlighter_380681\" class=\"syntaxhighlighter \">\n<div class=\"bar\">\n<div class=\"toolbar\"><a class=\"item viewSource\" style=\"width: 16px; height: 16px;\" title=\"view source\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#viewSource\"><span style=\"color: #0066cc;\">view source<\/span><\/a><a class=\"item printSource\" style=\"width: 16px; height: 16px;\" title=\"print\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#printSource\"><span style=\"color: #0066cc;\">print<\/span><\/a><a class=\"item about\" style=\"width: 16px; height: 16px;\" title=\"?\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#about\"><span style=\"color: #0066cc;\">?<\/span><\/a><\/div>\n<\/div>\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">library(XML)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"value\">url<\/code> <code class=\"plain\">&lt;- <\/code><code class=\"string\">\"<a href=\"http:\/\/en.wikipedia.org\/wiki\/List_of_countries_by_population\"><span style=\"color: #0066cc;\">http:\/\/en.wikipedia.org\/wiki\/List_of_countries_by_population<\/span><\/a>\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">4.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">population &lt;- readHTMLTable(<\/code><code class=\"value\">url<\/code><code class=\"plain\">, which=<\/code><code 
class=\"value\">3<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">5.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">population<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p>Or you can use the same package to simply grab XML content. I have found this particularly useful when I need geospatial data and need to get the latitude\/longitude of a location (this example uses the OpenStreetMap API provided by MapQuest). This example retrieves the coordinates of the White House.<\/p>\n<div id=\"highlighter_693508\" class=\"syntaxhighlighter \">\n<div class=\"bar\">\n<div class=\"toolbar\"><a class=\"item viewSource\" style=\"width: 16px; height: 16px;\" title=\"view source\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#viewSource\"><span style=\"color: #0066cc;\">view source<\/span><\/a><a class=\"item printSource\" style=\"width: 16px; height: 16px;\" title=\"print\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#printSource\"><span style=\"color: #0066cc;\">print<\/span><\/a><a class=\"item about\" style=\"width: 16px; height: 16px;\" title=\"?\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#about\"><span style=\"color: #0066cc;\">?<\/span><\/a><\/div>\n<\/div>\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"value\">url<\/code> <code class=\"plain\">&lt;- <\/code><code class=\"string\">\"<a href=\"http:\/\/open.mapquestapi.com\/geocoding\/v1\/address?location=1600%20Pennsylvania%20Ave\"><span style=\"color: 
#0066cc;\">http:\/\/open.mapquestapi.com\/geocoding\/v1\/address?location=1600%20Pennsylvania%20Ave<\/span><\/a>,%20Washington,%20DC&amp;outFormat=xml\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mygeo &lt;- xmlToDataFrame(<\/code><code class=\"value\">url<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mygeo$result<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n<p>An alternative approach is to use the JSON format. I generally find JSON to be a better format, and it can be readily used in most programming languages.<\/p>\n<div id=\"highlighter_318994\" class=\"syntaxhighlighter \">\n<div class=\"toolbar\"><a class=\"item viewSource\" style=\"width: 16px; height: 16px;\" title=\"view source\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#viewSource\"><span style=\"color: #0066cc;\">view source<\/span><\/a><a class=\"item printSource\" style=\"width: 16px; height: 16px;\" title=\"print\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#printSource\"><span style=\"color: #0066cc;\">print<\/span><\/a><a class=\"item about\" style=\"width: 16px; height: 16px;\" title=\"?\" href=\"http:\/\/statistical-research.com\/importing-data-into-r-from-different-sources\/#about\"><span style=\"color: #0066cc;\">?<\/span><\/a><\/div>\n<div class=\"lines\">\n<pre class=\"line alt1\"><code class=\"number\">1.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">library(rjson)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">2.<\/code><span class=\"content\"><span class=\"block\" 
style=\"margin-left: 0px !important;\"><code class=\"value\">url<\/code> <code class=\"plain\">&lt;- <\/code><code class=\"string\">\"<a href=\"http:\/\/open.mapquestapi.com\/geocoding\/v1\/address?location=1600%20Pennsylvania%20Ave\"><span style=\"color: #0066cc;\">http:\/\/open.mapquestapi.com\/geocoding\/v1\/address?location=1600%20Pennsylvania%20Ave<\/span><\/a>,%20Washington,%20DC&amp;outFormat=json\"<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">3.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">4.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">raw_json &lt;- scan(<\/code><code class=\"value\">url<\/code><code class=\"plain\">, <\/code><code class=\"string\">\"\"<\/code><code class=\"plain\">, sep=<\/code><code class=\"string\">\"\\n\"<\/code><code class=\"plain\">)<\/code><\/span><\/span><\/pre>\n<pre class=\"line alt1\"><code class=\"number\">5.<\/code><\/pre>\n<pre class=\"line alt2\"><code class=\"number\">6.<\/code><span class=\"content\"><span class=\"block\" style=\"margin-left: 0px !important;\"><code class=\"plain\">mygeo &lt;- fromJSON(paste(raw_json, collapse=<\/code><code class=\"string\">\"\"<\/code><code class=\"plain\">))<\/code><\/span><\/span><\/pre>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I have found that I get data from many different sources. 
These sources range from simple .csv files to more complex relational databases, to structure&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-632","post","type-post","status-publish","format-standard","hentry","category-r"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/632","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=632"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/632\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=632"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=632"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/tags?post=632"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}