Joining CSV files in Ubuntu
I'd like to join CSV files in Ubuntu.

file_A.csv:

    ID_a, ID_b, a, b, c
    key_a, A, a1, b1, c1
    key_a, B, a2, b2, c2
    key_b, A, a3, b3, c3

file_B.csv:

    ID_a, ID_b, d, e, f
    key_a, A, d1, e1, f1
    key_a, B, d2, e2, f2
    key_b, A, d3, e3, f3

join_AB.csv:

    ID_a, ID_b, a, b, c, d, e, f
    key_a, A, a1, b1, c1, d1, e1, f1
    key_a, B, a2, b2, c2, d2, e2, f2
    key_b, A, a3, b3, c3, d3, e3, f3

The input CSV files should be joined on common columns in their header. Is there a stock solution to this, or should I write my own script to do it?

ubuntu csv
asked Jul 25 '12 at 14:58
Andrew Wood
Duplicate: stackoverflow.com/questions/2619562/…
– jmetz
Jul 25 '12 at 16:12
2 Answers
Try the join command:

    NAME
           join - join lines of two files on a common field

    SYNOPSIS
           join [OPTION]... FILE1 FILE2

    DESCRIPTION
           For each pair of input lines with identical join fields, write a line
           to standard output. The default join field is the first, delimited by
           whitespace. When FILE1 or FILE2 (not both) is -, read standard input.

So you should be able to do:

    join file_A.csv file_B.csv > file_AB.csv

You may have to merge your first and second fields into one for this to work, though, since in essence they can be seen as one field anyway.

I just double-checked, and it seems to work as long as your files have a format like this:

file_A.csv

    ID_aID_b, a, b, c
    key_aA, a1, b1, c1
    key_aB, a2, b2, c2
    key_bA, a3, b3, c3

as I mentioned above.
edited Jul 25 '12 at 15:13
answered Jul 25 '12 at 15:05
jmetz
I don't think this'll work for me. I'd need to do scripting to merge then split the ID columns, so it would be just as easy to script the join.
– Andrew Wood
Jul 25 '12 at 15:44
@ajwood: That's unfortunate - in that case some amount of scripting will likely be needed.
– jmetz
Jul 25 '12 at 16:04
@ajwood - see my comment on the question itself - there is a very similar question already posted on stackoverflow.
– jmetz
Jul 25 '12 at 16:17
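The "merge, then split" idea discussed in these comments can be sketched in a few lines of Python. This is only an illustration and not code from either poster: the helper name `join_on_leading_columns` and its `n_keys` parameter are hypothetical, and the sketch assumes that the key columns come first in both files and that each key appears at most once in the second file. The leading columns are treated as one composite key, so nothing has to be physically merged or split.

```python
import csv
import io


def join_on_leading_columns(text_a, text_b, n_keys=2):
    """Inner-join two CSV texts on their first n_keys columns (hypothetical helper)."""
    def rows(text):
        # skipinitialspace drops the blank that follows each comma in the sample files
        return list(csv.reader(io.StringIO(text), skipinitialspace=True))

    rows_a, rows_b = rows(text_a), rows(text_b)
    # Index the second file by its composite key (the "merged" ID columns)
    index_b = {tuple(r[:n_keys]): r[n_keys:] for r in rows_b}
    joined = []
    for r in rows_a:
        key = tuple(r[:n_keys])
        if key in index_b:  # inner join: keep only keys present in both inputs
            joined.append(r + index_b[key])
    return joined
```

Because the header rows share the same leading columns, they are joined like any other row and end up as the first line of the result.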
Here is my solution in Python:

    import sys
    import csv


    def main(args):
        # Store each header we read
        headers = []
        # Intersect the headers to get our keys, keeping the first file's column order
        keys = None
        for arg in args:
            with open(arg, newline='') as f:
                curr = next(csv.reader(f))
            headers.append(curr)
            if keys is None:
                keys = list(curr)
            else:
                keys = [k for k in keys if k in curr]

        # New header: the keys first, then each file's remaining columns
        header = list(keys)
        for h in headers:
            header += [k for k in h if k not in keys]

        # Join data, indexed by the tuple of key values
        data = {}
        for arg in args:
            with open(arg, newline='') as f:
                reader = csv.DictReader(f)
                for line in reader:
                    data_key = tuple(line[k] for k in keys)
                    row = data.setdefault(data_key, {})
                    for k in header:
                        if k in line:
                            row[k] = line[k]

        # Drop keys that are missing data (keys not present in all files).
        # Iterate over a copy, since entries are deleted along the way.
        for key in list(data):
            if any(col not in data[key] for col in header):
                del data[key]

        # Dump data
        print(','.join(header))
        for key in sorted(data):
            print(','.join(data[key][col] for col in header))


    if __name__ == '__main__':
        sys.exit(main(sys.argv[1:]))
answered Jul 25 '12 at 20:56
Andrew Wood
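The key-detection step of this answer (find the columns common to every header) can be looked at in isolation. The following is a minimal sketch, assuming the headers have already been read into lists; the function name `common_columns` is hypothetical and does not appear in the answer. It keeps the column order of the first header, which a plain set intersection does not guarantee.

```python
def common_columns(*headers):
    """Return the columns present in every header, in the first header's order."""
    if not headers:
        return []
    keys = list(headers[0])
    for header in headers[1:]:
        present = set(header)
        # Keep only the candidate keys that this header also contains
        keys = [k for k in keys if k in present]
    return keys


header_a = ["ID_a", "ID_b", "a", "b", "c"]
header_b = ["ID_a", "ID_b", "d", "e", "f"]
print(common_columns(header_a, header_b))  # ['ID_a', 'ID_b']
```

With the sample files from the question, the common columns are `ID_a` and `ID_b`, which is the composite key the join needs.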