!!! Mesos
[{TableOfContents }]
!! Questions to be answered
! Assign only part of node resources to a slave ?
Can I run a mesos-slave on a node, but not dedicate all resources of that node to the cluster ?
\\For example if I want to run multiple clusters and run multiple slaves from different clusters on the same node.
This can be done with the --resources switch.
Create the file ''/etc/mesos-slave/resources'' with the following content :
{{{
cpus(*):0.3; mem(*):512; disk(*):6543; ports(*):[31000-32000]
}}}
Then you might have to remove the slave's previous state with ''rm -vf /tmp/mesos/meta/slaves/latest'' and restart the slave with ''systemctl restart mesos-slave''.
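To make the format of the resources string concrete, here is a minimal Python sketch that parses it. The function name ''parse_resources'' is made up for illustration and is not part of Mesos; it only handles the scalar and port-range forms used in the example above.

```python
import re


def parse_resources(spec):
    """Parse 'name(role):value; ...' into {name: (role, value)}.

    Hypothetical helper for illustration only, not a Mesos API.
    Scalars become floats, '[a-b]' ranges become (a, b) tuples.
    """
    resources = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        match = re.fullmatch(r"(\w+)\(([^)]*)\):(.+)", item)
        if match is None:
            raise ValueError("cannot parse resource: %r" % item)
        name, role, raw = match.groups()
        raw = raw.strip()
        range_match = re.fullmatch(r"\[(\d+)-(\d+)\]", raw)
        if range_match:
            value = (int(range_match.group(1)), int(range_match.group(2)))
        else:
            value = float(raw)
        resources[name] = (role, value)
    return resources


if __name__ == "__main__":
    spec = "cpus(*):0.3; mem(*):512; disk(*):6543; ports(*):[31000-32000]"
    for name, (role, value) in parse_resources(spec).items():
        print(name, role, value)
```

The role in parentheses is ''*'' everywhere here, which I take to mean "any role".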
! How to set up security, at least slave authentication ?
See the Mesos configuration documentation at :
* [https://docs.mesosphere.com/reference/mesos-master/]
* [https://docs.mesosphere.com/reference/mesos-slave/]
* [http://mesos.apache.org/documentation/latest/configuration/]
Found it: as the documentation says, a flag can be set by creating a file named after it :
''A file named the same name as the flag may be placed in the /etc/mesos-master directory. So a /etc/mesos-master/hostname file containing the value of 10.141.141.10 is like running the master with the option --hostname=10.141.141.10''
So create the following files (''file ==> content''), then restart the master and the slave :
{{{
/etc/mesos-master/authenticate ==> true
/etc/mesos-master/authenticate_slaves ==> true
/etc/mesos-master/credentials ==> /etc/mesos-config/mesos-master.passwd
/etc/mesos-slave/credential ==> /etc/mesos-config/mesos-slave.passwd
/etc/mesos-config/mesos-master.passwd ==> user password
/etc/mesos-config/mesos-slave.passwd ==> user password
}}}
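The "one file per flag" convention quoted above can be illustrated with a small Python sketch. This is only my reading of the convention, not the actual wrapper script shipped with the packages, and ''flags_from_dir'' is a made-up name.

```python
import os
import tempfile


def flags_from_dir(directory):
    """Return '--name=value' for every regular file in directory.

    Illustration of the convention only; the real init wrapper
    may treat some files differently.
    """
    flags = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            value = handle.read().strip()
        flags.append("--%s=%s" % (name, value))
    return flags


if __name__ == "__main__":
    # Use a throwaway directory instead of the real /etc/mesos-master.
    with tempfile.TemporaryDirectory() as demo:
        for name, value in [("authenticate", "true"),
                            ("authenticate_slaves", "true")]:
            with open(os.path.join(demo, name), "w") as handle:
                handle.write(value + "\n")
        print(" ".join(flags_from_dir(demo)))
```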
Use the __mesos state__ command to verify:
{{{
[root@node1 conf]# mesos state
{
"lost_tasks": 0,
"build_user": "root",
"pid": "master@192.168.33.10:5050",
"build_time": 1427387871,
"finished_tasks": 0,
"unregistered_frameworks": [],
"id": "20150421-050455-169978048-5050-945",
"git_sha": "e890e2414903bb69cab730d5204f10b10d2e91bb",
"build_date": "2015-03-26 16:37:51",
"hostname": "node1",
"version": "0.22.0",
"log_dir": "/var/log/mesos",
"killed_tasks": 0,
"leader": "master@192.168.33.10:5050",
"deactivated_slaves": 0,
"failed_tasks": 0,
"start_time": 1429592695.33795,
"git_tag": "0.22.0",
"staged_tasks": 0,
"completed_frameworks": [],
"elected_time": 1429592710.071,
"orphan_tasks": [],
"activated_slaves": 0,
"frameworks": [],
"flags": {
"help": "false",
"zk": "zk://localhost:2181/mesos",
"recovery_slave_removal_limit": "100%",
"port": "5050",
"logbufsecs": "0",
"authenticate": "true",
"work_dir": "/var/lib/mesos",
"slave_reregister_timeout": "10mins",
"authenticators": "crammd5",
"authenticate_slaves": "true",
"framework_sorter": "drf",
"version": "false",
"log_dir": "/var/log/mesos",
"logging_level": "INFO",
"log_auto_initialize": "true",
"registry_strict": "false",
"registry_fetch_timeout": "1mins",
"root_submissions": "true",
"webui_dir": "/usr/share/mesos/webui",
"registry": "replicated_log",
"credentials": "/etc/mesos-config/mesos-master.passwd",
"allocation_interval": "1secs",
"zk_session_timeout": "10secs",
"quorum": "1",
"user_sorter": "drf",
"quiet": "false",
"registry_store_timeout": "5secs",
"initialize_driver_logging": "true"
},
"started_tasks": 0,
"slaves": []
}
}}}
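The interesting part of that output is the ''flags'' object. A minimal Python sketch to pull out the authentication-related values, assuming the JSON printed by ''mesos state'' has been loaded into a dict (the function name is my own, and the sample below is a trimmed copy of the output above):

```python
import json


def authentication_summary(state):
    """Pick the authentication-related values out of a master state dict."""
    flags = state.get("flags", {})
    return {
        # Flag values are strings in the state output, not booleans.
        "authenticate": flags.get("authenticate") == "true",
        "authenticate_slaves": flags.get("authenticate_slaves") == "true",
        "credentials": flags.get("credentials"),
        "activated_slaves": state.get("activated_slaves"),
    }


if __name__ == "__main__":
    sample = json.loads("""
    {
        "activated_slaves": 0,
        "flags": {
            "authenticate": "true",
            "authenticate_slaves": "true",
            "credentials": "/etc/mesos-config/mesos-master.passwd"
        }
    }
    """)
    print(authentication_summary(sample))
```

Note that ''activated_slaves'' is 0 in the output above: the flags are set, but no slave has registered.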
But authentication of the slave fails :
{{{
Apr 21 06:11:48 node1 mesos-master[945]: W0421 06:11:48.951118 1175 master.cpp:3866] Failed to authenticate slave(1)@192.168.33.10:5051: Failed to get list of mechanisms: SASL(-4): no mechanism available: Internal Error -4 in server.c near line 1757
Apr 21 06:11:48 node1 mesos-master[945]: I0421 06:11:48.954591 1175 master.cpp:3813] Authenticating slave(1)@192.168.33.10:5051
Apr 21 06:11:48 node1 mesos-master[945]: I0421 06:11:48.954753 1175 master.cpp:3824] Using default CRAM-MD5 authenticator
Apr 21 06:11:48 node1 mesos-master[945]: I0421 06:11:48.955693 1175 authenticator.hpp:170] Creating new server SASL connection
Apr 21 06:11:48 node1 mesos-master[945]: W0421 06:11:48.957067 1175 authenticator.hpp:213] Failed to get list of mechanisms: no mechanism available
}}}
%%warning I reverted to running without security. %%
! Can I run without root ?
By default everything (master and slave) runs as root. Can you run without root, and if so, what are the consequences ?
! What is the (CPU) overhead ?
I already noticed that it is significant. My test setup :
* 1 master
* 5 slaves (slaves running in docker containers)
* 3 applications in marathon
* 38 instances total (Tasks)
The average CPU usage of a slave is constantly about 15%. On top of that there is a mesos-executor process for each task, each using about 0.7% CPU.
If I add another application with 22 instances, totalling 60 tasks, the CPU usage of a slave goes up to about 25% (bad).
This looks like a [known issue|https://issues.apache.org/jira/browse/MESOS-2254].
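As a rough back-of-the-envelope sum of those numbers, the sketch below adds the slave and executor percentages over the whole cluster. It assumes all percentages are relative to the same unit (one CPU core), which I did not verify, so treat the result as an order of magnitude only.

```python
def total_overhead_percent(slaves, slave_cpu, tasks, executor_cpu):
    """Sum of the slave and executor CPU percentages over the cluster."""
    return slaves * slave_cpu + tasks * executor_cpu


if __name__ == "__main__":
    # 5 slaves at ~15% plus 38 executors at ~0.7% each
    print(round(total_overhead_percent(5, 15.0, 38, 0.7), 1))
    # 5 slaves at ~25% plus 60 executors at ~0.7% each
    print(round(total_overhead_percent(5, 25.0, 60, 0.7), 1))
```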
! How robust is it ?
When you are a bit "rough" with marathon, for example by scaling up a few applications shortly after one another, it crashes (and restarts) :
{{{
Apr 24 17:09:12 node1 marathon[12427]: F0424 17:09:12.267240 12462 check.hpp:79] Check failed: f.isReady()
Apr 24 17:09:12 node1 marathon[12427]: [2015-04-24 17:09:12,272] INFO 192.168.33.1 - - [24/Apr/2015:17:09:12 +0000] "GET /v2/deployments HTTP/1.1" 200 259 "http://192.168.33.10:8080/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0" (mesosphere.chaos.http.ChaosRequestLog:15)
Apr 24 17:09:12 node1 marathon[12427]: *** Check failure stack trace: ***
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0e76ed google::LogMessage::Fail()
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0e942c google::LogMessage::SendToLog()
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0e72dc google::LogMessage::Flush()
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0e9d29 google::LogMessageFatal::~LogMessageFatal()
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0ddcd4 _checkReady<>()
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fef7b0dc4a0 Java_org_apache_mesos_state_AbstractState__1_1fetch_1get_1timeout
Apr 24 17:09:12 node1 marathon[12427]: @ 0x7fefa5960b82 (unknown)
Apr 24 17:09:32 node1 marathon: run_jar --zk zk://localhost:2181/marathon --master zk://localhost:2181/mesos
}}}
!! Install summary
A short summary of playing around with __mesos, marathon and chronos__.\\
Mostly based on the [Mesosphere intro course|http://docs.mesosphere.com/intro-course].
Add this option to your Vagrantfile :
{{{
config.vm.box_download_insecure = true
}}}
Log in with vagrant@localhost:2222 (password ''vagrant''), or use ''vagrant ssh''.
Install mesos:
{{{
sudo rpm -Uvh http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
sudo yum -y install mesos marathon
}}}
Install zookeeper, the distributed configuration service used by mesos:
{{{
sudo rpm -Uvh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
sudo yum -y install zookeeper zookeeper-server
}}}
Initialize and start Zookeeper:
{{{
sudo -u zookeeper zookeeper-server-initialize --myid=1
sudo service zookeeper-server start
}}}
Install java: ''yum -y install java-1.8.0-openjdk''
Run the interactive zookeeper shell, ''/usr/lib/zookeeper/bin/zkCli.sh'', and issue a few test commands.
Start mesos master and slave :
{{{
systemctl start mesos-master
systemctl start mesos-slave
}}}
The Mesos web UI is available at [http://192.168.33.10:5050]
Play around a bit with mesos :
{{{
export MASTER=$(mesos-resolve `cat /etc/mesos/zk` 2>/dev/null)
mesos help
}}}
Bring up a second node, node2 at 192.168.33.12 :
Install mesos:
{{{
sudo rpm -Uvh http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
sudo yum -y install mesos marathon
}}}
Install zookeeper, the distributed configuration service used by mesos:
{{{
sudo rpm -Uvh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
sudo yum -y install zookeeper zookeeper-server
}}}
Initialize and start Zookeeper:
{{{
sudo -u zookeeper zookeeper-server-initialize --myid=1
sudo service zookeeper-server start
}}}
Run the interactive zookeeper shell, ''/usr/lib/zookeeper/bin/zkCli.sh'', and issue a few test commands.
Edit the zookeeper URL in ''/etc/mesos/zk'' and change the IP address to the address of the master.
Start mesos slave :
{{{
sudo systemctl start mesos-slave
}}}
Make sure the nodes can be resolved by name (update /etc/hosts).
By default, marathon logs to syslog (/var/log/messages).
Running tasks always have a port, and this port is web-accessible, giving you access to stdout and stderr.
Messing with the marathon REST API (see [Marathon REST API|https://mesosphere.github.io/marathon/docs/rest-api.html]) :
''curl --show-error --silent http://192.168.33.10:8080/metrics | python -m json.tool ''
Or, for example the following URLs :
* [http://192.168.33.10:8080/v2/apps]
* [http://192.168.33.10:8080/v2/apps/test] ==> more detail on app "test"
Delete an app:
''curl -X DELETE http://192.168.33.10:8080/v2/apps/test | python -m json.tool''
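The same DELETE call can be sketched in Python with only the standard library. The helper names are mine; the URL layout is the one from the curl command above. The demo only builds and prints the request, so it runs without a Marathon instance.

```python
import json
import urllib.request

MARATHON = "http://192.168.33.10:8080"  # address used throughout these notes


def delete_app_request(app_id, base_url=MARATHON):
    """Build (but do not send) the DELETE request for one app."""
    url = "%s/v2/apps/%s" % (base_url, app_id.strip("/"))
    return urllib.request.Request(url, method="DELETE")


def call(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = delete_app_request("test")
    print(request.get_method(), request.full_url)
    # Uncomment when a Marathon instance is reachable:
    # print(json.dumps(call(request), indent=4))
```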
Create an app by posting the following data (saved in the file app1.json) :
%%prettify
{{{
{
"id": "/app1",
"cmd": "python -m SimpleHTTPServer $PORT",
"args": null,
"user": null,
"env": {},
"instances": 3,
"cpus": 0.9,
"mem": 16.0,
"disk": 10.0,
"executor": "",
"constraints": [],
"uris": ["/testapp"],
"storeUrls": [],
"ports": [10000],
"requirePorts": false,
"backoffSeconds": 1,
"backoffFactor": 1.15,
"maxLaunchDelaySeconds": 3600,
"container": null,
"healthChecks": [],
"dependencies": [],
"upgradeStrategy": {
"minimumHealthCapacity": 1.0,
"maximumOverCapacity": 1.0
}
}
}}}
%%
''curl -v -H "Content-Type: application/json" -X POST --data @app1.json http://192.168.33.10:8080/v2/apps''
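A Python sketch of the same POST, using only the standard library. For brevity the demo sends a subset of the fields from app1.json; I assume Marathon fills in defaults for the omitted ones, which I have not checked for every field. ''create_app_request'' is a made-up helper name.

```python
import json
import urllib.request


def create_app_request(app, base_url="http://192.168.33.10:8080"):
    """Build (but do not send) the POST request that creates an app."""
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    app = {
        "id": "/app1",
        "cmd": "python -m SimpleHTTPServer $PORT",
        "instances": 3,
        "cpus": 0.9,
        "mem": 16.0,
        "ports": [10000],
    }
    request = create_app_request(app)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # urllib.request.urlopen(request) would actually submit it.
```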
Now install chronos (the cron for mesos) :
{{{
sudo yum -y install chronos
sudo service chronos start
}}}
Chronos installs as a mesos framework, like marathon does (marathon is a sort of init.d for mesos).
Chronos is available at [http://192.168.33.10:4400/]
Install the mesos command line utility :
{{{
curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
sudo python get-pip.py
sudo pip install virtualenv
sudo pip install mesos.cli
}}}
__Run mesos-slave in a docker container__
First create a slightly modified image from redjack/mesos-slave (just installing python in it).
Use the following command to start the container :
%%small
{{{
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.3; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave3 -p 5051:5051 -p 8000:8000 --name slave3 --hostname slave3 harry:mesos-slave
}}}
%%
And if you want to start a few more :
%%small
{{{
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave3 -p 5051:5051 --name slave3 --hostname slave3 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave4 -p 5054:5051 --name slave4 --hostname slave4 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave5 -p 5055:5051 --name slave5 --hostname slave5 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave6 -p 5056:5051 --name slave6 --hostname slave6 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave7 -p 5057:5051 --name slave7 --hostname slave7 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave8 -p 5058:5051 --name slave8 --hostname slave8 harry:mesos-slave
docker run -d -e MESOS_LOG_DIR=/var/log -e MESOS_RESOURCES='cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]' -e MESOS_MASTER=zk://192.168.33.10:2181/mesos -e MESOS_HOSTNAME=slave9 -p 5059:5051 --name slave9 --hostname slave9 harry:mesos-slave
}}}
%%
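Those commands differ only in the slave number and the host port, so they can be generated. A minimal Python sketch, with the values copied from the commands above; the port rule (slave3 on 5051, slaveN on 505N otherwise) is simply what those commands happen to use.

```python
import shlex

RESOURCES = "cpus(*):0.8; mem(*):512; disk(*):6543; ports(*):[31000-32000]"
MASTER = "zk://192.168.33.10:2181/mesos"
IMAGE = "harry:mesos-slave"


def host_port(index):
    """Host port mapped to the slave's port 5051 inside the container."""
    return 5051 if index == 3 else 5050 + index


def docker_run_command(index):
    """Return the 'docker run' command line for slave number index."""
    name = "slave%d" % index
    args = [
        "docker", "run", "-d",
        "-e", "MESOS_LOG_DIR=/var/log",
        "-e", "MESOS_RESOURCES=" + RESOURCES,
        "-e", "MESOS_MASTER=" + MASTER,
        "-e", "MESOS_HOSTNAME=" + name,
        "-p", "%d:5051" % host_port(index),
        "--name", name,
        "--hostname", name,
        IMAGE,
    ]
    # shlex.quote protects the spaces and parentheses in RESOURCES.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    for index in range(3, 10):
        print(docker_run_command(index))
```

The quoting in the generated lines looks slightly different from the hand-written ones (the whole ''MESOS_RESOURCES=...'' argument is quoted), but the shell should parse both the same way.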
Also make sure to edit /etc/hosts and add an entry for each such node (use the IP address of the docker container, not the host).
!! Logging
Create ''/etc/rsyslog.d/mesos.conf'' with the following content :
%%prettify
{{{
if $programname == 'marathon' then {
action(type="omfile" file="/var/log/mesos/marathon.log")
}
if $programname == 'chronos' then {
action(type="omfile" file="/var/log/mesos/chronos.log")
}
if $programname == 'mesos-master' then {
action(type="omfile" file="/var/log/mesos/mesos-master.log")
}
if $programname == 'mesos-slave' then {
action(type="omfile" file="/var/log/mesos/mesos-slave.log")
}
}}}
%%
And look at ''/var/log/mesos/'' for the resulting files.
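Since the four rules follow one pattern, the file can also be generated, which helps if more programs need their own log file later. A small Python sketch; the function names are my own and it only reproduces the rule shape shown above.

```python
PROGRAMS = ["marathon", "chronos", "mesos-master", "mesos-slave"]


def rsyslog_rule(program, log_dir="/var/log/mesos"):
    """Return one rsyslog rule sending a program's messages to its own file."""
    return (
        "if $programname == '%s' then {\n"
        "    action(type=\"omfile\" file=\"%s/%s.log\")\n"
        "}\n" % (program, log_dir, program)
    )


def rsyslog_config(programs=PROGRAMS):
    """Concatenate the rules for all programs."""
    return "".join(rsyslog_rule(program) for program in programs)


if __name__ == "__main__":
    # Redirect the output to /etc/rsyslog.d/mesos.conf as root.
    print(rsyslog_config(), end="")
```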