HiQ Advanced TopicsΒΆ

The metrics described in the previous chapter are enough for most of the use cases for system metrics. To gain more insights on business metrics, you need to customize HiQ.

Customized TracingΒΆ

HiQ is flexible so that you can customize it to trace other non-built-in metrics, such as business metrics. In order to customize it, you need to create your own class inheriting class hiq.HiQSimple and implement two functions def custom(self) and def custom_disable(self).

Log Metrics and Information to stdioΒΆ

The following is a code example to demo how to log information, including business metrics, into terminal. The target code is a call chain from main()-> func1() -> func2(). The arguments for the main function are two dictionaries: model and data. We know the data input has two keys img_path and size, and we want to log the values corresponding to the keys.

Target Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
import time


def func1(model: dict, data: dict) -> int:
    time.sleep(1.5)
    r2 = func2(model, data)
    return r2 * 2


def func2(model: dict, data: dict) -> int:
    time.sleep(2.5)
    return len(data["img_path"])


def main(model: dict, data: dict) -> int:
    r = func1(model, data)
    return r


if __name__ == "__main__":
    res = main(model={"data": "abc"}, data={"img_path": "/tmp/hiq.jpg", "size": 1024})
    print(res)

Driver Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
import os
import hiq
from inspect import currentframe as cf
from hiq.constants import *


class MyHiQ(hiq.HiQSimple):
    def custom(self):
        @self.inserter
        def __my_main(data={}, model={}) -> int:
            if "img_path" in data:
                print(f"🌴 print log for img_path: {data['img_path']}")
            if "img_size" in data:
                print(f"🌴 print log for img_size: {data['img_size']}")
            return self.o_main(data=data, model=model)

        self.o_main = hiq.mod("main").main
        hiq.mod("main").main = __my_main

    def custom_disable(self):
        hiq.mod("main").main = self.o_main


def run_main():
    with hiq.HiQStatusContext():
        _ = MyHiQ()
        hiq.mod("main").main(
            model={"data": "abc"}, data={"img_path": "/tmp/hello.jpg", "img_size": 1024}
        )


if __name__ == "__main__":
    run_main()

In the custom() function, we define a new function called __my_main which has the same signature of the target code’s main function, and assign the target code’s main to self.o_main, assign __my_main to the target code’s main.

Inside the __my_main function, we check if there is img_path in the data argument. If there is, we log it. Finally we call self.o_main and return the result.

Run the driver code and get the output:

ℹ️  python examples/custom/stdio/main_driver.py
🌴 print log for img_path: /tmp/hello.jpg
🌴 print log for img_size: 1024

Without touching the target code, we logged one line of message into standard io console. This is useful for debugging purposes. We can also trace the information in HiQ Tree.

Trace Metrics and Information In HiQ TreeΒΆ

The target code will be the same as above. The difference here is we extract the information inside __my_main and define a function with decorator @self._inserter_with_extra(extra={}). extra will contain the information we want to trace. In this case, they are the image path and size.

Driver Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
import os
import hiq
from inspect import currentframe as cf
from hiq.constants import *


class MyHiQ(hiq.HiQSimple):
    def custom(self):
        def __my_main(data={}, model={}, *args, **kwargs) -> int:
            img_path = data["img_path"] if "img_path" in data else None
            img_size = data["img_size"] if "img_size" in data else None

            @self.inserter_with_extra(extra={"img": img_path, "size": img_size})
            def __z(data, model):
                return self.o_main(data=data, model=model)

            return __z(data, model)

        self.o_main = hiq.mod("main").main
        hiq.mod("main").main = __my_main

    def custom_disable(self):
        hiq.mod("main").main = self.o_main


def run_main():
    with hiq.HiQStatusContext():
        driver = MyHiQ()
        hiq.mod("main").main(
            model={"data": "abc"},
            data={"img_path": "/tmp/hello.jpg", "from": "driver", "img_size": 1024},
        )
        driver.show()


if __name__ == "__main__":
    run_main()

Run the driver code and get the output:

ℹ️  python examples/custom/hiqtree/main_driver.py
[2021-11-05 05:05:39.910686 - 05:05:43.914784]  [100.00%] 🟒_root_time(4.0041)
                                                            [{'img': '/tmp/hello.jpg', 'size': 1024}]
                                                            [OH:104us]
[2021-11-05 05:05:39.910686 - 05:05:43.914784]  [100.00%]    l___z(4.0041)

Under the tree’s root node, we can see the image path information and image size metric.

Log Monkey KingΒΆ

_images/lmk_cartoon_small.jpgHiQ Log Monkey King

LMK is a separate high performance logging system of HiQ. Sometimes we don’t need the structural information of the trace, we just need to log data into a file in the disk. In this case, we can use LMK.

To use LMK, an environment variable LMK must be enabled.

Log Metrics and Information to stdioΒΆ

Without extra setup, LMK will print out logging information in stdio.

Target Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
import time


def func1():
    time.sleep(1.5)
    func2()


def func2():
    time.sleep(2.5)


def main():
    func1()


if __name__ == "__main__":
    main()

Driver Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
import os
import hiq

here = os.path.dirname(os.path.realpath(__file__))


def run_main():
    _ = hiq.HiQLatency(f"{here}/hiq.conf")
    hiq.mod("main").main()


if __name__ == "__main__":
    import time

    os.environ["LMK"] = "1"
    run_main()
    time.sleep(2)

At line 15, we set LMK equals to 1, which enables log monkey king. Run the code and we can get:

ℹ️  python examples/lmk/stdio/main_driver.py
2021-11-05 07:45:42.019567 - [time] [πŸ†” 2418220] 🐡 [main]
2021-11-05 07:45:42.020127 - [time] [πŸ†” 2418220] 🐡 [func1]
2021-11-05 07:45:43.521903 - [time] [πŸ†” 2418220] 🐡 [func2]
2021-11-05 07:45:46.024517 - [time] [πŸ†” 2418220] πŸ™ˆ [func2]
2021-11-05 07:45:46.024616 - [time] [πŸ†” 2418220] πŸ™ˆ [func1]
2021-11-05 07:45:46.024635 - [time] [πŸ†” 2418220] πŸ™ˆ [main]

The default log format is:

time_stamp - [metric name] [process id] monkey [function name] [extra information]

🐡 means function call is started, and πŸ™ˆ means function call is completed.

Log Metrics and Information to fileΒΆ

We can easily log the metrics and information into a file with LMK. LMK supports Python’s built-in logging module and third party logging module like loguru.

Python built-in logging moduleΒΆ

Target Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
import time


def func1():
    time.sleep(1.5)
    func2()


def func2():
    time.sleep(2.5)


def main():
    func1()


if __name__ == "__main__":
    main()

Driver Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
import logging
import os

import hiq

here = os.path.dirname(os.path.realpath(__file__))


LOG_FORMAT = "%(levelname)s %(asctime)s - %(message)s"

logging.basicConfig(
    filename="/tmp/lmk.log", filemode="w", format=LOG_FORMAT, level=logging.INFO
)

logger = logging.getLogger()


def run_main():
    _ = hiq.HiQLatency(f"{here}/hiq.conf", lmk_logger=logger)
    hiq.mod("main").main()


if __name__ == "__main__":
    import time

    os.environ["LMK"] = "1"
    run_main()
    time.sleep(2)
  • Explanation

Line 9-15: set up logging format, log file path and name

Line 19: pass logger as lmk_logger when constructing HiQLatency Object

Run the driver code, then you can see the log has been written into file /tmp/lmk.log:

ℹ️  python examples/lmk/logging/main_driver.py
ℹ️  cat /tmp/lmk.log 
INFO 2021-11-05 17:03:57,581 - 2021-11-05 17:03:57.580419 - [time] [πŸ†” 3568910] 🐡 [main]
INFO 2021-11-05 17:03:57,581 - 2021-11-05 17:03:57.581022 - [time] [πŸ†” 3568910] 🐡 [func1]
INFO 2021-11-05 17:03:59,083 - 2021-11-05 17:03:59.082735 - [time] [πŸ†” 3568910] 🐡 [func2]
INFO 2021-11-05 17:04:01,585 - 2021-11-05 17:04:01.585346 - [time] [πŸ†” 3568910] πŸ™ˆ [func2]
INFO 2021-11-05 17:04:01,585 - 2021-11-05 17:04:01.585472 - [time] [πŸ†” 3568910] πŸ™ˆ [func1]
INFO 2021-11-05 17:04:01,585 - 2021-11-05 17:04:01.585492 - [time] [πŸ†” 3568910] πŸ™ˆ [main]

Third-party Logging Library SupportΒΆ

LMK supports third-party logging libraries which conforms to the standard logging protocol. One example is loguru. loguru is an easy-to-use, asynchronous, thread-safe, multiprocess-safe logging library. You can install it by running:

pip install loguru

The target code is the same as above. This is the driver Code:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import os

import hiq
from loguru import logger

here = os.path.dirname(os.path.realpath(__file__))


def run_main():
    _ = hiq.HiQLatency(
        f"{here}/hiq.conf", lmk_logger=logger, lmk_path="/tmp/lmk_guru.log"
    )
    hiq.mod("main").main()


if __name__ == "__main__":
    import time

    os.environ["LMK"] = "1"
    run_main()
    time.sleep(2)

Run the driver code, you can see the information is printed in the terminal:

_images/guru.jpg

The same information is also stored in the log file:

ℹ️  cat /tmp/lmk_guru.log
2021-11-05 17:45:54.346 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:54.346130 - [time] [πŸ†” 3659097] 🐡 [main]
2021-11-05 17:45:54.347 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:54.346699 - [time] [πŸ†” 3659097] 🐡 [func1]
2021-11-05 17:45:55.848 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:55.848450 - [time] [πŸ†” 3659097] 🐡 [func2]
2021-11-05 17:45:58.351 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:58.351059 - [time] [πŸ†” 3659097] πŸ™ˆ [func2]
2021-11-05 17:45:58.351 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:58.351163 - [time] [πŸ†” 3659097] πŸ™ˆ [func1]
2021-11-05 17:45:58.351 | INFO     | hiq.monkeyking:consumer:69 - 2021-11-05 17:45:58.351182 - [time] [πŸ†” 3659097] πŸ™ˆ [main]

LumberJackΒΆ

_images/lumberjack.jpgLumberjack

Different from LMK, which writes log entry for each span, LumberJack is to handle an entire HiQ tree. For simplicity, we call it Jack. Jack is very useful in use cases where the overhead for processing metrics is so big that you cannot process each entry one by one. Kafaka is one Exmaple. Due to message encoding, network latency and response validation, a call to a Kafaka producer’s send_message can easily take more than 1 second. Jack is a good way to handle Kafka message. We can send metrics tree to Kafka and process it later with an analytics server. This will be described in details in section Integration with OCI Streaming.

Jack also writes a 500MB-rotated log in ~/.hiq/log_jack.log unless you set environmental variable NO_JACK_LOG.

$ tail -n3 ~/.hiq/log_jack.log
time,v2,0,{"None":1637008247.9725869,1637008251.9771237,{"__main":1637008247.9725869,1637008251.9771237,{"__func1":1637008247.972686,1637008251.9771047,{"__func2":1637008249.4744177,1637008251.977021,}}}}
time,v2,0,{"None":1637008251.9785185,1637008255.9829764,{"__main":1637008251.9785185,1637008255.9829764,{"__func1":1637008251.978641,1637008255.982966,{"__func2":1637008253.480345,1637008255.9829247,}}}}
time,v2,0,{"None":1637008255.983492,1637008259.9854834,{"__main":1637008255.983492,1637008259.9854834,{"__func1":1637008255.9836354,1637008259.9854727,{"__func2":1637008257.485351,1637008259.9854305,}}}}

Async and Multiprocessing in PythonΒΆ

  • TODO