{"id":818,"date":"2015-03-13T09:25:15","date_gmt":"2015-03-13T16:25:15","guid":{"rendered":"http:\/\/homepages.uc.edu\/~yaozo\/wordpress\/?p=818"},"modified":"2015-03-13T09:25:15","modified_gmt":"2015-03-13T16:25:15","slug":"compare-two-data-frames-to-find-the-rows-in-data-frame-1-that-are-not-present-in-data-frame-2","status":"publish","type":"post","link":"https:\/\/zhuoyao.net\/index.php\/2015\/03\/13\/compare-two-data-frames-to-find-the-rows-in-data-frame-1-that-are-not-present-in-data-frame-2\/","title":{"rendered":"Compare two data.frames to find the rows in data.frame 1 that are not present in data.frame 2"},"content":{"rendered":"<div class=\"post-text\">\n<p>SQLDF provides a nice solution<\/p>\n<pre class=\"lang-r prettyprint prettyprinted\"><code><span class=\"pln\">a1 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data.frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b<\/span><span class=\"pun\">=<\/span><span class=\"pln\">letters<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">5<\/span><span class=\"pun\">])<\/span><span class=\"pln\">\na2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> data.frame<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a <\/span><span class=\"pun\">=<\/span> <span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">3<\/span><span class=\"pun\">,<\/span><span class=\"pln\"> b<\/span><span class=\"pun\">=<\/span><span class=\"pln\">letters<\/span><span class=\"pun\">[<\/span><span class=\"lit\">1<\/span><span class=\"pun\">:<\/span><span class=\"lit\">3<\/span><span class=\"pun\">])<\/span><span class=\"pln\">\n\nrequire<\/span><span class=\"pun\">(<\/span><span class=\"pln\">sqldf<\/span><span 
class=\"pun\">)<\/span><span class=\"pln\">\n\na1NotIna2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'SELECT * FROM a1 EXCEPT SELECT * FROM a2'<\/span><span class=\"pun\">)<\/span><\/code><\/pre>\n<p>And the rows which are in both data frames:<\/p>\n<pre class=\"lang-r prettyprint prettyprinted\"><code><span class=\"pln\">a1Ina2 <\/span><span class=\"pun\">&lt;-<\/span><span class=\"pln\"> sqldf<\/span><span class=\"pun\">(<\/span><span class=\"str\">'SELECT * FROM a1 INTERSECT SELECT * FROM a2'<\/span><span class=\"pun\">)<\/span><\/code><\/pre>\n<p>The new version of dplyr has a function, anti_join, for exactly this kind of comparison:<\/p>\n<pre class=\"lang-r prettyprint prettyprinted\"><code><span class=\"pln\">require<\/span><span class=\"pun\">(<\/span><span class=\"pln\">dplyr<\/span><span class=\"pun\">)<\/span><span class=\"pln\">\n\nanti_join<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a1<\/span><span class=\"pun\">,<\/span><span class=\"pln\">a2<\/span><span class=\"pun\">)<\/span><\/code><\/pre>\n<p>And semi_join keeps the rows in a1 that are also in a2:<\/p>\n<pre class=\"lang-r prettyprint prettyprinted\"><code><span class=\"pln\">semi_join<\/span><span class=\"pun\">(<\/span><span class=\"pln\">a1<\/span><span class=\"pun\">,<\/span><span class=\"pln\">a2<\/span><span class=\"pun\">)<\/span><\/code><\/pre>\n<\/div>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SQLDF provides a nice solution a1 &lt;- data.frame(a = 1:5, b=letters[1:5]) a2 &lt;- data.frame(a = 1:3, b=letters[1:3]) require(sqldf) a1NotIna2 &lt;- sqldf(&#8216;SELECT * FROM a1 EXCEPT&hellip; 
<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-818","post","type-post","status-publish","format-standard","hentry","category-r"],"_links":{"self":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/818","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/comments?post=818"}],"version-history":[{"count":0,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/posts\/818\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/media?parent=818"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/categories?post=818"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhuoyao.net\/index.php\/wp-json\/wp\/v2\/tags?post=818"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}