Sorting rows of a csv file

I would like to sort a csv file that looks like this:

Timestamp,Message
2019-12-03T11:10:35,mymessage1
2019-12-03T10:10:10,mymessage2
2019-11-03T12:140:35,mymessage3

The sorting should be done on the Timestamp.
I have written the following code to import and sort the csv file:

package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"sort"
)

func readCsvFile(filePath string) [][]string {
	f, err := os.Open(filePath)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer f.Close()

	csvReader := csv.NewReader(f)
	records, err := csvReader.ReadAll()
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}

	sort.Slice(records, func(i, j int) bool {
		return records[i][0] < records[j][0]
	})

	return records
}

func main() {
	records := readCsvFile("myfile.csv")
	fmt.Println(records)
}

My question is: how can I exclude the header from the sorting?

Also, this code will run on a machine with limited resources available (most of them are used by another program). It is hard to figure out how many resources are available at any given time. Considering that the csv file to sort has around 100,000 rows and a total size of ~50 MB, is this approach the best in terms of efficiency? Or would you suggest a better approach that lowers the risk of hitting an out-of-memory error?

Thanks!

If the header is the first row, add a special condition to the comparison function so that every other line compares as greater than the header.
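
One way to read this suggestion, as a minimal sketch: sort.Slice moves rows around while it works, so the special condition has to look at the content of a row rather than at its index. Here I assume the header can be recognised by its first field being literally "Timestamp"; the name sortKeepingHeader is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sortKeepingHeader (hypothetical helper) sorts records by the first column
// while forcing the header row to compare as smaller than every data row.
// Assumption: the header is the only row whose first field is "Timestamp".
func sortKeepingHeader(records [][]string) {
	isHeader := func(row []string) bool {
		return len(row) > 0 && row[0] == "Timestamp"
	}
	sort.Slice(records, func(i, j int) bool {
		if isHeader(records[i]) {
			return !isHeader(records[j]) // header goes before any data row
		}
		if isHeader(records[j]) {
			return false // a data row never goes before the header
		}
		return records[i][0] < records[j][0]
	})
}

func main() {
	records := [][]string{
		{"Timestamp", "Message"},
		{"2019-12-03T11:10:35", "mymessage1"},
		{"2019-12-03T10:10:10", "mymessage2"},
	}
	sortKeepingHeader(records)
	fmt.Println(records)
}
```

Without the special condition the header would end up last here, because digits sort before the letter "T" in a plain string comparison.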

Regarding the header question: before you call records, err := csvReader.ReadAll(), which reads all remaining records from the csv file, call header, err := csvReader.Read(), which reads only the first record (your header). The subsequent ReadAll() then returns the rest, and you can return the header and the records separately.
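
A rough sketch of that idea, in case it helps. To keep it self-contained, the hypothetical helper sortCSV takes the CSV text as a string; with a real file you would pass f to csv.NewReader exactly as in your code. Note that an empty input makes Read return io.EOF, which this sketch simply reports as an error.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"sort"
	"strings"
)

// sortCSV (hypothetical helper) reads the header with Read, the remaining
// rows with ReadAll, and sorts only those rows by their first column.
func sortCSV(data string) (header []string, rows [][]string, err error) {
	csvReader := csv.NewReader(strings.NewReader(data))
	header, err = csvReader.Read() // first record: the header
	if err != nil {
		return nil, nil, err
	}
	rows, err = csvReader.ReadAll() // everything after the header
	if err != nil {
		return nil, nil, err
	}
	sort.Slice(rows, func(i, j int) bool {
		return rows[i][0] < rows[j][0]
	})
	return header, rows, nil
}

func main() {
	data := "Timestamp,Message\n" +
		"2019-12-03T11:10:35,mymessage1\n" +
		"2019-12-03T10:10:10,mymessage2\n"
	header, rows, err := sortCSV(data)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(header)
	fmt.Println(rows)
}
```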

Re: your second question about limited resources, I guess what you are looking for is an external merge sort algorithm.
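
For what it is worth, here is a simplified sketch of how an external merge sort could look for this file layout; it is not a tuned implementation. It reads the input in chunks, sorts each chunk in memory, writes it to a temporary file, and then merges the chunks by repeatedly picking the smallest pending row. The names externalSort, writeChunk and sortText and the chunkRows parameter are my own, and os.CreateTemp assumes Go 1.16 or newer.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
	"sort"
	"strings"
)

// writeChunk sorts rows by the first column and stores them in a temp file.
func writeChunk(rows [][]string) (*os.File, error) {
	sort.Slice(rows, func(i, j int) bool { return rows[i][0] < rows[j][0] })
	f, err := os.CreateTemp("", "chunk-*.csv")
	if err != nil {
		return nil, err
	}
	if err = csv.NewWriter(f).WriteAll(rows); err == nil { // WriteAll flushes
		_, err = f.Seek(0, io.SeekStart) // rewind for the merge phase
	}
	return f, err
}

// externalSort copies the header, then sorts the data rows in chunkRows-sized chunks.
func externalSort(in io.Reader, out io.Writer, chunkRows int) error {
	r, w := csv.NewReader(in), csv.NewWriter(out)
	header, err := r.Read()
	if err != nil {
		return err
	}
	w.Write(header) // write errors are sticky and surface via w.Error() below
	// Phase 1: split the input into sorted chunks stored in temp files.
	var srcs []*csv.Reader // one reader per chunk
	var rows [][]string
	for done := false; !done; {
		row, err := r.Read()
		if err != nil && err != io.EOF {
			return err
		}
		if done = err == io.EOF; !done {
			rows = append(rows, row)
		}
		if len(rows) > 0 && (done || len(rows) >= chunkRows) {
			f, err := writeChunk(rows)
			if f != nil { // close and delete the file when externalSort returns
				defer os.Remove(f.Name())
				defer f.Close()
			}
			if err != nil {
				return err
			}
			srcs = append(srcs, csv.NewReader(f))
			rows = rows[:0]
		}
	}
	// Phase 2: k-way merge. heads[i] is the next unread row of chunk i.
	heads := make([][]string, len(srcs))
	for i, src := range srcs {
		if heads[i], err = src.Read(); err != nil {
			return err
		}
	}
	for {
		best := -1 // index of the chunk whose pending row is smallest
		for i, head := range heads {
			if head != nil && (best < 0 || head[0] < heads[best][0]) {
				best = i
			}
		}
		if best < 0 {
			break // every chunk is exhausted
		}
		w.Write(heads[best])
		heads[best], err = srcs[best].Read() // Read returns a nil row at io.EOF
		if err != nil && err != io.EOF {
			return err
		}
	}
	w.Flush()
	return w.Error()
}

// sortText runs externalSort on an in-memory string, just for the demo.
func sortText(text string, chunkRows int) (string, error) {
	var out strings.Builder
	err := externalSort(strings.NewReader(text), &out, chunkRows)
	return out.String(), err
}

func main() {
	input := "Timestamp,Message\n2019-12-03T11:10:35,mymessage1\n" +
		"2019-12-03T10:10:10,mymessage2\n2019-11-03T12:140:35,mymessage3\n"
	sorted, err := sortText(input, 2) // 2-row chunks force a real merge
	if err != nil {
		panic(err)
	}
	fmt.Print(sorted)
}
```

With a real file you would pass the opened input and output files to externalSort instead of going through sortText, and pick chunkRows according to how much memory you are willing to spend. The linear scan over the chunk heads is fine for a handful of chunks; with many chunks a heap (container/heap) would likely be the better choice.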


The solution was quite simple, in front of my eyes, but I could figure it out only after your suggestion.

rows := records[1:] // skip the header; assumes the file has at least one line
sort.Slice(rows, func(i, j int) bool {
	return rows[i][0] < rows[j][0]
})

Because records[1:] shares its backing array with records, this sorts everything except the header in place, and the header stays at records[0].

Will have a look at the external merge sort algorithm, and see if that would work out for my application.

https://play.golang.org/p/CRBlOvg7Nbw
