The Goal:

You've got your blog going with Blogger, and now you're ready to set up a custom domain!

How to set up a custom domain?

First, you will need to register your domain. To register a domain, you go through a domain registrar. (Becoming a domain registrar yourself requires ICANN accreditation, so you will want to register through an existing registrar rather than becoming one.)

Now, there are many registrars out there. Among the most popular is GoDaddy, and I see Google is providing this service too. I decided to go with NameCheap.com for a few reasons.

Now you’ve registered your domain. What’s next?

  1. Sign in to Blogger.
  2. From the upper-left drop-down, select the blog you want to update.
  3. On the left menu, click Settings and then Basic.
  4. Under “Publishing,” click “+ Setup a 3rd party URL for your blog”.
  5. Type the URL of the domain you’ve purchased.
  6. Click Save.
  7. You’ll see an error with two CNAMEs.

    In the “Name, Label or Host” column, the first row should show the subdomain you entered, like “blog” or “www”; in this example, it is “subdomain”. The destination should show “ghs.googlehosted.com”, which is the same for everyone.

    The second row's destination is different for each person: it is specific to your blog and your Google Account.

  8. Now log in to Namecheap and go to Dashboard -> Domain List -> Advanced DNS.
    1. Click the “Add New Record” link at the bottom and select “CNAME Record”.
    2. Enter the value from “Name, Label or Host” on Blogger as the Host.
    3. Enter the value from “Destination, Target or Points to” on Blogger, which is “ghs.googlehosted.com”, as the Value.
    4. Repeat the same for the second row on Blogger.
  9. Wait at least an hour for your DNS settings to activate (you can verify propagation with dig, as shown after this list).
  10. Repeat steps 1 through 6. Once you click Save, you should not get an error this time. Your blogspot.com address will redirect to your custom domain. It may take up to 24 hours.
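
A quick way to run that propagation check from your machine is a dig lookup (a sketch; it assumes you mapped the “www” subdomain, and yourdomain.com is a placeholder). It should return Google's host:

$ dig +short www.yourdomain.com CNAME
ghs.googlehosted.com.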

Cheers!

The problem:

You installed Elasticsearch on a server. You can run curl localhost:9200 locally and all looks good, but access is denied from outside when you run curl <server-ip>:9200.

How to solve it

So, first things first: Elasticsearch needs to listen on the IP you are accessing. To make it listen on all interfaces, change / add network.host: 0.0.0.0 in /etc/elasticsearch/elasticsearch.yml and restart the Elasticsearch server. Try curl <server-ip>:9200 again. Does it work? That's great, your server is configured and ready on port 9200.
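
Here is a minimal sketch of that change (paths and service name assume a Debian/Ubuntu package install; note that 0.0.0.0 exposes Elasticsearch on every interface, so make sure your firewall rules are in order):

# append the bind setting and restart
echo 'network.host: 0.0.0.0' | sudo tee -a /etc/elasticsearch/elasticsearch.yml
sudo systemctl restart elasticsearch
curl <server-ip>:9200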

If your access is rejected then there are several things you can check:

Is the server running?

Make sure to run systemctl status elasticsearch (assuming you are managing the service via systemd). If it says active, you are good. If not, start it and test again, as shown below the status output.

● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2018-08-22 11:21:30 MST; 58min ago
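
If it is not active, start it and re-test (the unit name assumes the standard package install):

sudo systemctl start elasticsearch
curl localhost:9200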

Is the port listening?

This is where I got stuck. In the output below, you can see 9200 is in LISTEN state only on tcp6 and not on IPv4. I fixated on the idea that Elasticsearch was not binding to IPv4. Later I found this is actually fine: on Linux, a tcp6 socket bound to ::: also accepts IPv4 connections by default. See here for more details. But if you google “es not binding to ipv4” there are quite a few hits, and I tried applying the suggestions (e.g., setting an environment variable to force IPv4, like export ES_JAVA_OPTS="-Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Addresses"), with no luck of course, because that wasn't the problem, as mentioned above.

root@bd-gpu01-s02:~# netstat -p tcp -na | grep 9200
tcp        0      0 10.102.111.221:43180    192.168.202.121:9200    ESTABLISHED 31786/node
tcp        0      0 10.102.111.221:43178    192.168.202.121:9200    ESTABLISHED 31786/node
tcp        0      0 10.102.111.221:43270    192.168.202.121:9200    ESTABLISHED 31786/node
tcp6       0      0 :::9200                 :::*                    LISTEN      113888/java
tcp6       0      0 10.102.111.221:9200     10.101.95.238:59009     ESTABLISHED 113888/java
tcp6       0      0 10.102.111.221:9200     10.101.95.238:59002     ESTABLISHED 113888/java
unix  3      [ ]         STREAM     CONNECTED     4089200  140511/python3.6
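
A quick way to convince yourself that the tcp6 listener also serves IPv4 is to force curl onto IPv4. If this responds, the dual-stack socket is doing its job:

curl -4 localhost:9200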

Is the firewall allowing it?

Yes, this is the first thing I did, right? I was running Ubuntu, so I used ufw (Uncomplicated Firewall). When I checked the status, 9200 was there with “ALLOW” as expected.

root@bd-gpu01-s02:~# ufw status
Status: active

To                         Action      From
--                         ------      ----
8899                       ALLOW       Anywhere
22                         ALLOW       Anywhere
5000                       ALLOW       Anywhere
80                         ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
Nginx HTTP                 ALLOW       Anywhere
9200                       ALLOW       Anywhere
8899 (v6)                  ALLOW       Anywhere (v6)
22 (v6)                    ALLOW       Anywhere (v6)
5000 (v6)                  ALLOW       Anywhere (v6)
80 (v6)                    ALLOW       Anywhere (v6)
80/tcp (v6)                ALLOW       Anywhere (v6)
Nginx HTTP (v6)            ALLOW       Anywhere (v6)
9200 (v6)                  ALLOW       Anywhere (v6)

But I just could not access it! So, just to be sure, I checked from my host whether 9200 is open on the server:

➜  ~ sudo nmap -p 9200 10.102.111.221
Password:

Starting Nmap 7.60 ( https://nmap.org ) at 2018-08-22 10:24 MST
Nmap scan report for es-01 (10.102.111.221)
Host is up (0.091s latency).

PORT     STATE    SERVICE
9200/tcp filtered wap-wsp

Then it shows “filtered”… wha~~~t??? Grrrr…

Okay, back to basics. Let's check with iptables and see if all is good.

iptables -S

Note: -S option (or --list-rules) [chain]: Print all rules in the selected chain. If no chain is selected, all chains are printed like iptables-save. Like every other iptables command, it applies to the specified table (filter is the default).

Then I finally saw the issue.

# iptables -S INPUT
-P INPUT DROP
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
...
-A INPUT -j REJECT --reject-with icmp-host-prohibited
...
-A ufw-user-input -p tcp -m tcp --dport 9200 -j ACCEPT

The iptables rules are processed in order. My newly added rule went into the ufw-user-input chain (the chain ufw uses for user rules), which in the flattened -S output sits below the REJECT in the INPUT chain, so packets to 9200 were rejected before my ACCEPT was ever reached.

ufw does have insert <position-number>, but the rule still ends up below the INPUT chain's REJECT in iptables. So what I ended up doing was inserting a rule at position 1 using iptables directly, and it finally worked… phew.

# insert a rule at line 1
iptables -I INPUT 1 -p tcp --dport 9200 -j ACCEPT

ufw has nice syntax, but I guess in order to use it, you want a clean iptables setup to start with.
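
One caveat with the iptables -I fix: the inserted rule lives only in the running kernel and disappears on reboot. A sketch for making it stick on Ubuntu, using the iptables-persistent package:

sudo apt-get install iptables-persistent
# save the current IPv4/IPv6 rule set so it is restored at boot
sudo netfilter-persistent save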

Cheers!

The problem:

I need to ship specific log records, and I had a formatter already written in Python. It is a pretty complex transformation.

I thought of using Logstash, but then I would need to either convert this Python logic or write a plugin to use the already-written Python parser. Plus, I would need to install Logstash… I wanted a simpler solution.

How to solve it

Use a custom Python logging Handler and Filter!

import logging

messages = []
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)


class ListenFilter(logging.Filter):

    def filter(self, record):
        """Determine which log records to output.
        Returns 0 for no, nonzero for yes.
        """
        if record.getMessage().startswith('dont: '):
            return False
        return True


class RequestsHandler(logging.Handler):
    def emit(self, record):
        """Send the log records (created by loggers) to
        the appropriate destination.
        """
        messages.append(record.getMessage())


handler = RequestsHandler()
logger.addHandler(handler)

filter_ = ListenFilter()
logger.addFilter(filter_)

# log I want
logger.info("logme: Howdy!")


# log i want to skip
logger.info("dont: I'm doing great!")

# prints ['logme: Howdy!']
print(messages)
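
One design note: a filter added to the logger only runs for records logged directly through that logger. If you also want the filtering applied to records propagated up from child loggers, attach the filter to the handler instead, with handler.addFilter(filter_).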

Cheers!

The problem:

When you start a small Machine Learning team with a few projects, your experiments are done via Jupyter Notebooks, and maybe the notebooks are in GitHub. A notebook might contain a method to download the data so it can be reproduced, but it gets harder and harder to track the various experiments.

We also need to make sure models do not train on corrupted / skewed data and that only high-quality models are pushed to production. Currently these processes are manual, not centralized, and lack a unified common tool.

How to solve it

A typical ML workflow looks like this: [figure: typical ML workflow]

Facebook built FBLearner Flow, but it is an internal toolset and not available for the public to use. The platform manages:

  • Manage data
  • Train models
  • Evaluate models
  • Deploy models
  • Make predictions
  • Monitor predictions

Google’s TFX is available as three components: TensorFlow Transform (data transformation), TensorFlow Model Analysis, and TensorFlow Serving. As the names indicate, the toolset is scoped to TensorFlow. The platform manages:

  • Data ingestion
  • Data Analysis
  • Data Transformation
  • Data validation
  • Trainer
  • Model Evaluation and validation
  • Serving
  • Logging –> Data ingestion

Uber built Michelangelo, but just like FBLearner, it is an internal ML-as-a-service platform and not available to the public.

Databricks, the company founded by the creators of Spark, announced MLflow: an open source machine learning platform!

Documentation is located here.

On 06/05/2018 they announced it on their blog, and three days later (today), there are already 1,596 stars on GitHub.

Quick start

# ensure I have pipenv to create virtualenv
$ pip3 install pipenv --user
# install mlflow
$ pipenv install mlflow
# activate mlflow environment
$ pipenv shell

# run test experiment
$ git clone https://github.com/databricks/mlflow.git
$ python mlflow/example/quickstart/test.py

# start UI
$ mlflow ui -h 0.0.0.0

# then you should see the test experiment 

It consists of three components: [figure: MLflow components]

  1. Tracking: For querying and recording data on experiments. Using the web UI, you can view and compare the output of multiple runs (see the sketch after this list).
  2. Projects: Provides a simple format for reproducing code
  3. Models: For managing and deploying models into production
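
As a small taste of the Tracking component, here is a sketch that logs one parameter and one metric into the local ./mlruns directory (the names alpha and rmse are made up for illustration):

$ python -c "import mlflow; mlflow.log_param('alpha', 0.5); mlflow.log_metric('rmse', 0.78)"
# the run now shows up in the UI under the Default experiment
$ mlflow ui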

It would be nice to have data components to do (a) analysis, (b) transformation, and (c) validation, but I want to watch, evaluate, and see how this grows!

I should also compare other available tools such as Amazon SageMaker and IDSIA's Sacred.

More reporting to come.

Cheers!

The problem:

Ubuntu prompted me that 18.04 LTS was available, so I clicked Upgrade to initiate the upgrade. When the upgrade completed, the system rebooted and came back with the lowest screen resolution you can imagine.

Fix

In summary… this fix is only good for System76 systems. It looks like a driver update was necessary. Anyhow…

So the upgrade was completed:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04 LTS
Release:	18.04
Codename:	bionic

and it looks like I lost the LSB modules, so restoring those…

sudo apt-get install lsb-core
$ lsb_release -a
LSB Version:	core-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04 LTS
Release:	18.04
Codename:	bionic

Download and install the current System76 Driver.

sudo apt-add-repository -ys ppa:system76-dev/stable
sudo apt-get update
sudo apt-get install -y system76-driver

If you ordered a system with a discrete NVIDIA graphics card, you will need to manually install the closed source drivers for your card to get the optimum performance. Please run the following command:

sudo apt install system76-driver-nvidia

Restart the computer and you should have your nice set of resolutions back.
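
To confirm (assuming an X session), xrandr prints the detected outputs and the available modes:

xrandr --query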

Cheers!