author: Paul Buetow <paul@buetow.org> 2023-09-05 17:41:02 +0300
committer: Paul Buetow <pbuetow@mimecast.com> 2023-09-07 15:32:33 +0300
commit: 05ef7d56f945242fecb97cf03a3a9abab47013ee
tree: a8230d3a827506d435dae9c48eda8d8df0368a0c /doc
parent: f771066f175c7bde9fd5cbcf39ab855afd5d5786
add CSV aggr example to docs
Diffstat (limited to 'doc')
-rw-r--r-- doc/examples.md | 19
-rw-r--r-- doc/logformats.md | 4
2 files changed, 22 insertions, 1 deletions
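
As a rough illustration of the semantics of the CSV query documented in the diff below (`select lastname,name where age > 40`), here is a minimal Python sketch. It is not how DTail implements the query: `select_where` is a hypothetical helper, and comparing `age` as an integer is an assumption inferred from the example output in the diff.

```python
import csv
import io


def select_where(csv_text, columns, predicate):
    """Return CSV text with the chosen columns of every row matching predicate.

    Hypothetical helper for illustration only; not part of DTail.
    The first line of csv_text is treated as the header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)  # header of the result file
    for row in reader:
        if predicate(row):
            writer.writerow([row[c] for c in columns])
    return out.getvalue()


# Same data as example.csv in the diff.
EXAMPLE = """name,lastname,age,profession
Michael,Jordan,40,Basketball player
Michael,Jackson,100,Singer
Albert,Einstein,200,Physician
"""

# Assumption: age is compared numerically, so the row with age 40 is excluded.
result = select_where(EXAMPLE, ["lastname", "name"], lambda r: int(r["age"]) > 40)
print(result, end="")
```

Because the comparison is strict, the row with `age` equal to 40 does not appear in the result, which matches the `result.csv` contents shown in the diff.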
diff --git a/doc/examples.md b/doc/examples.md
index 26ce002..4937cc5 100644
--- a/doc/examples.md
+++ b/doc/examples.md
@@ -151,6 +151,25 @@ You can also use a file input pipe as follows:
dmap 'from STATS select $hostname,max($goroutines),max($cgocalls),$loadavg,lifetimeConnections group by $hostname order by max($cgocalls)'
```
+### Aggregating CSV files
+
+In essence, this works exactly like aggregating logs. All files operated on must be valid CSV files and the first line of the CSV must be the header. E.g.:
+
+```shell
+% cat example.csv
+name,lastname,age,profession
+Michael,Jordan,40,Basketball player
+Michael,Jackson,100,Singer
+Albert,Einstein,200,Physician
+% dmap --query 'select lastname,name where age > 40 logformat csv outfile result.csv' example.csv
+% cat result.csv
+lastname,name
+Jackson,Michael
+Einstein,Albert
+```
+
+DMap can also be used to query and aggregate CSV files from remote servers.
+
### Other serverless commands
The serverless mode works transparently with all other DTail commands. Here are some examples:
diff --git a/doc/logformats.md b/doc/logformats.md
index 839b050..dbf2051 100644
--- a/doc/logformats.md
+++ b/doc/logformats.md
@@ -10,8 +10,10 @@ You could either make your application follow the DTail default log format, or y
The following log formats are currently available out of the box:
* `default` - The default DTail log format
-* `generic` - A generic log format with a very simple set of fields
+* `generic` - A generic log format with a simple set of fields
* `generickv` - A simple log format expecting all log lines in form of `field1=value1|field2=value2|...`
+* `csv` - A simple CSV format expecting every file to be a comma-separated CSV file. The first line of the file must be the CSV header.
+* `custom1` and `custom2` - Customizable log formats.
### Selecting a log format