ELK#

ELK stands for Elasticsearch, Logstash and Kibana.

Installing Elasticsearch#

First pull the docker image and run it:

download and run the docker image
docker pull sebp/elk
.....
Digest: sha256:8e250160ac22d339e57ba20768137dbeca2187c94082959220569a9318f85134
Status: Downloaded newer image for sebp/elk:latest
metskem@athena:~$ docker run -p 5601:5601 -p 9200:9200 -p 5000:5000 -it --name elk sebp/elk
 * Starting Elasticsearch Server  sysctl: setting key "vm.max_map_count": Read-only file system
   [ OK ]
logstash started.
waiting for Elasticsearch to be up (1/30)
waiting for Elasticsearch to be up (2/30)
waiting for Elasticsearch to be up (3/30)
waiting for Elasticsearch to be up (4/30)
waiting for Elasticsearch to be up (5/30)
waiting for Elasticsearch to be up (6/30)
waiting for Elasticsearch to be up (7/30)
waiting for Elasticsearch to be up (8/30)
 * Starting Kibana4   [ OK ]
[2015-11-14 14:42:40,076][INFO ][node                     ] [Ardroman] initialized
[2015-11-14 14:42:40,077][INFO ][node                     ] [Ardroman] starting ...
[2015-11-14 14:42:40,141][WARN ][common.network           ] [Ardroman] publish address: {0.0.0.0} is a wildcard address, falling back to first non-loopback: {172.17.1.56}
[2015-11-14 14:42:40,141][INFO ][transport                ] [Ardroman] publish_address {172.17.1.56:9300}, bound_addresses {[::]:9300}
[2015-11-14 14:42:40,197][INFO ][discovery                ] [Ardroman] elasticsearch/SGBOqCisRoK5aXakkplosQ
[2015-11-14 14:42:43,259][INFO ][cluster.service          ] [Ardroman] new_master {Ardroman}{SGBOqCisRoK5aXakkplosQ}{172.17.1.56}{172.17.1.56:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2015-11-14 14:42:43,335][WARN ][common.network           ] [Ardroman] publish address: {0.0.0.0} is a wildcard address, falling back to first non-loopback: {172.17.1.56}
[2015-11-14 14:42:43,336][INFO ][http                     ] [Ardroman] publish_address {172.17.1.56:9200}, bound_addresses {[::]:9200}
[2015-11-14 14:42:43,336][INFO ][node                     ] [Ardroman] started
[2015-11-14 14:42:43,337][INFO ][gateway                  ] [Ardroman] recovered [0] indices into cluster_state
[2015-11-14 14:42:55,965][INFO ][cluster.metadata         ] [Ardroman] [.kibana] creating index, cause [api], templates [], shards [1]/[1], mappings [config]
[2015-11-14 14:45:40,093][INFO ][cluster.metadata         ] [Ardroman] [logstash-2015.11.14] creating index, cause [auto(bulk api)], templates [logstash], shards [5]/[1], mappings [logs, _default_]
[2015-11-14 14:45:40,357][INFO ][cluster.metadata         ] [Ardroman] [logstash-2015.11.14] update_mapping [logs]
[2015-11-14 14:46:53,017][INFO ][cluster.metadata         ] [Ardroman] [.kibana] create_mapping [index-pattern]
[2015-11-14 14:47:42,004][INFO ][cluster.metadata         ] [Ardroman] [.kibana] update_mapping [config]
[2015-11-14 14:48:50,680][INFO ][cluster.metadata         ] [Ardroman] [.kibana] create_mapping [dashboard]

The docker image also runs Logstash, which we don't need for now (see further below); we will send the logs directly to Elasticsearch.
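To confirm that Elasticsearch and Kibana answer on the ports published by the docker run command (9200 and 5601), something like the following could be used. This is only a sketch: the host name athena is taken from the rest of this page, and the helper names elk_url and http_status are made up for illustration.

```shell
#!/bin/sh
# Host running the ELK container; "athena" is the docker host used on this page.
ES_HOST="${ES_HOST:-athena}"

# Build the base URL for a given port on the ELK host.
elk_url() {
  printf 'http://%s:%s/' "$ES_HOST" "$1"
}

# Print the HTTP status code for a URL (curl prints 000 if no response arrives).
http_status() {
  curl -s -o /dev/null --connect-timeout 3 -m 5 -w '%{http_code}' "$1" || true
}

echo "Elasticsearch: $(http_status "$(elk_url 9200)")"
echo "Kibana:        $(http_status "$(elk_url 5601)")"
```

A 200 for both means the services are up; anything else suggests the container is still starting or the ports are not reachable from this machine.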

Then install Filebeat on an active (web) server to get some real log data to process:

download and install filebeat
root@apollo:~# curl -L -O https://download.elastic.co/beats/filebeat/filebeat_1.0.0-rc1_i386.deb
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3390k  100 3390k    0     0  1443k      0  0:00:02  0:00:02 --:--:-- 1443k
 
root@apollo:~# dpkg -i filebeat_1.0.0-rc1_i386.deb 
Selecting previously unselected package filebeat.
(Reading database ... 190915 files and directories currently installed.)
Preparing to unpack filebeat_1.0.0-rc1_i386.deb ...
Unpacking filebeat (1.0.0~rc1) ...
Setting up filebeat (1.0.0~rc1) ...
Processing triggers for ureadahead (0.100.0-16) ...

root@apollo:~# dpkg --listfiles filebeat
/.
/usr
/usr/share
/usr/share/doc
/usr/share/doc/filebeat
/usr/share/doc/filebeat/changelog.Debian.gz
/usr/bin
/usr/bin/filebeat-god
/usr/bin/filebeat
/etc
/etc/filebeat
/etc/filebeat/filebeat.template.json
/etc/filebeat/filebeat.yml
/etc/init.d
/etc/init.d/filebeat
root@apollo:~# 

Then edit the /etc/filebeat/filebeat.yml file: set paths to /var/log/apache2/access.log, the frequency to 3s and hosts to "athena:9200".
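The relevant part of the edited file might look roughly like the sketch below, which is written to a scratch file so it can be compared with the shipped /etc/filebeat/filebeat.yml before merging by hand. The key names are assumptions based on the Filebeat 1.x layout: in particular, "frequency" is assumed to mean the prospector's scan_frequency, and input_type: log is assumed. The scratch file name is hypothetical.

```shell
#!/bin/sh
# Scratch location for the sketch (hypothetical name); nothing under /etc is touched.
SKETCH="${SKETCH:-${TMPDIR:-/tmp}/filebeat.sketch.yml}"

# Assumed Filebeat 1.x key names; check them against the packaged filebeat.yml.
cat > "$SKETCH" <<'EOF'
filebeat:
  prospectors:
    -
      paths:
        - /var/log/apache2/access.log
      input_type: log
      scan_frequency: 3s
output:
  elasticsearch:
    hosts: ["athena:9200"]
EOF

echo "wrote $SKETCH"
```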

Next, load the index template into Elasticsearch:

root@apollo:/etc/filebeat# curl -XPUT 'http://athena:9200/_template/filebeat?pretty' -d@/etc/filebeat/filebeat.template.json
{
  "acknowledged" : true
}
root@apollo:/etc/filebeat# 

And start filebeat:

root@apollo:/etc/filebeat# /etc/init.d/filebeat start
root@apollo:/var/log# ps -ef|grep filebeat|grep -v grep
root      6672     1  0 16:18 pts/1    00:00:00 /usr/bin/filebeat-god -r / -n -p /var/run/filebeat.pid -- /usr/bin/filebeat -c /etc/filebeat/filebeat.yml
root      6673  6672  4 16:18 pts/1    00:00:04 /usr/bin/filebeat -c /etc/filebeat/filebeat.yml

Finally (this is not documented for Filebeat), add an extra filebeat-* index pattern, basically a copy of the default logstash-* one, in Kibana under Settings ==> Indices ==> Create new.
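Before adding filebeat-*, it can help to check that Filebeat has actually created indices in Elasticsearch. A minimal sketch using the _cat/indices endpoint, assuming Elasticsearch is reachable at athena:9200 as configured above; the helper name cat_indices_url is made up for illustration.

```shell
#!/bin/sh
# Base URL of the Elasticsearch instance (host name taken from this page).
ES="${ES:-http://athena:9200}"

# URL of the _cat/indices listing for a given index name pattern.
cat_indices_url() {
  printf '%s/_cat/indices/%s?v' "$ES" "$1"
}

# List the matching indices; print a note instead when the host cannot be reached.
curl -s --connect-timeout 3 -m 5 "$(cat_indices_url 'filebeat-*')" || echo "could not reach $ES"
```

If the listing comes back empty, Filebeat has not shipped anything yet and there is nothing to build the index on.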

The net result of the above actions is that we do get data in Elasticsearch, but every log line is stored as a single field called message.
What we want is for the Apache log file to be parsed and for all fields (clientip, request, response code and so on) to be stored in Elasticsearch.
I spent several hours trying to find out how to do this with Filebeat, but could not find it; I guess it must have something to do with filebeat.template.json.
Anyway, I continued with the classic Logstash, see the next chapter.

Installing (classic) logstash#

I first installed the Logstash deb and then created the following Logstash config file:

logstash.conf
input {
  file {
    path => "/var/log/apache2/access.log"
    type => "apache2"
  }
}

filter {
  grok {
    match => { "message" => "%{IPORHOST:clientip} %{HTTPDUSER:ident} \[%{HTTPDATE:timestamp}\] %{NUMBER:timetaken} \"%{IPORHOST:vhost}\" \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}" }
    add_field => [ "received_at", "%{@timestamp}" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
    target => "geoip"
    database => "/etc/logstash/GeoLiteCity.dat"
    # Adding the same field twice turns it into an array: [longitude, latitude]
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
  }
}

output {
  elasticsearch { hosts => ["athena:9200","10.0.0.162:9200"] }
}
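To make it easier to see which fields the grok pattern above is meant to extract, here is a rough approximation in awk. It is not how Logstash parses the line, and the sample log line is hypothetical: it only follows the layout implied by the pattern (clientip, ident, timestamp, timetaken, quoted vhost, quoted request, response, bytes, quoted referrer, quoted agent). Unlike grok's QS, this sketch drops the surrounding quotes from referrer and agent.

```shell
#!/bin/sh
# Hypothetical access.log line in the layout the grok pattern expects.
SAMPLE='203.0.113.7 - [14/Nov/2015:15:45:39 +0100] 1234 "www.example.org" "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'

# Split one log line into name=value pairs, roughly mirroring the grok field names.
parse_line() {
  printf '%s\n' "$1" | awk -F'"' '{
    split($1, head, " ")            # clientip, ident, [date, zone], timetaken
    ts = head[3] " " head[4]
    gsub(/\[|\]/, "", ts)           # drop the square brackets around the timestamp
    split($4, req, " ")             # verb, request, HTTP/version
    sub(/^HTTP\//, "", req[3])      # keep only the version number
    split($5, tail, " ")            # response, bytes
    print "clientip=" head[1]
    print "timestamp=" ts
    print "timetaken=" head[5]
    print "vhost=" $2
    print "verb=" req[1]
    print "request=" req[2]
    print "httpversion=" req[3]
    print "response=" tail[1]
    print "bytes=" tail[2]
    print "referrer=" $6
    print "agent=" $8
  }'
}

parse_line "$SAMPLE"
```

If a real line from access.log does not split cleanly this way, the grok pattern is likely to fail on it too, and the event will not get the parsed fields.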


Make the user logstash a member of the adm group (so it can read the apache2 log files) and restart with /etc/init.d/logstash restart. And there we have a logstash-* index in Elasticsearch with all requested fields, hurray!
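The group change and restart could be scripted along these lines. This is a sketch under assumptions: the deb package is assumed to have created a system user named logstash, and the Apache logs are assumed to be readable by the adm group (the Debian/Ubuntu default). The helper name in_group is made up for illustration, and the script has to run as root to change anything.

```shell
#!/bin/sh
# Assumed names: user "logstash" from the deb package, group "adm" owning the Apache logs.
LS_USER="${LS_USER:-logstash}"
LOG_GROUP="${LOG_GROUP:-adm}"

# Succeeds when the given user is a member of the given group.
in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

if ! id "$LS_USER" >/dev/null 2>&1; then
  echo "user $LS_USER does not exist; install logstash first"
elif in_group "$LS_USER" "$LOG_GROUP"; then
  echo "$LS_USER is already in $LOG_GROUP"
else
  # -a appends to the supplementary groups instead of replacing them.
  usermod -a -G "$LOG_GROUP" "$LS_USER" && /etc/init.d/logstash restart
fi
```

The restart matters because group membership is only picked up when the Logstash process is started again.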