Performance Result Buffer

We need to keep track of the best-performing hyperparameter configurations found for the model during tuning. These top configurations are stored in the performance result buffer.

Priority Queue

Tuning results should be ordered by the accuracy that the model achieves on the validation dataset under each hyperparameter configuration, so the buffer is implemented internally as a priority queue.

We implemented our own PriorityQueue, which is available in linguaml.collections.

Suppose we have the following tuning results:

from linguaml.tolearn.families import SVCFamily

result1 = {
    "hp_config": SVCFamily.hp()(
        C=0.1,
        tol=0.01,
        gamma=0.01,
        kernel="linear",
        decision_function_shape="ovo"
    ),
    "accuracy": 0.6
}

result2 = {
    "hp_config": SVCFamily.hp()(
        C=10,
        tol=0.01,
        gamma=0.1,
        kernel="linear",
        decision_function_shape="ovo"
    ),
    "accuracy": 0.9
}

result3 = {
    "hp_config": SVCFamily.hp()(
        C=1000,
        tol=0.01,
        gamma=0.1,
        kernel="rbf",
        decision_function_shape="ovr"
    ),
    "accuracy": 0.7
}

To put them into a priority queue, you must first define the order. You do so by initializing the priority queue with a function that computes the priority of an item. In the example below, the priority of a result is its negated accuracy.

You can then push each result into the priority queue.

from linguaml.collections import PriorityQueue

# Create an empty queue
q = PriorityQueue(
    get_priority=lambda result: -result["accuracy"]
)

# Push items
q.push(result1)
q.push(result2)
q.push(result3)

The peek_first_n_items method returns the n items with the highest priority.

q.peek_first_n_items(3)

If you wish to push a list of items, you may use the extend method.

q = PriorityQueue(
    get_priority=lambda result: -result["accuracy"]
)

q.extend([
    result1,
    result2,
    result3
])

q.peek_first_n_items(3)
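
For intuition, here is a minimal sketch of a queue with the same three methods, built on Python's heapq module. It is not the linguaml implementation: the class name SketchPriorityQueue is hypothetical, and the sketch assumes that a smaller priority value means the item comes first, which would be one reason to negate the accuracy as in the examples above.

```python
import heapq
from itertools import count
from typing import Any, Callable, Iterable, List


class SketchPriorityQueue:
    """Hypothetical stand-in for illustration; not linguaml's PriorityQueue."""

    def __init__(self, get_priority: Callable[[Any], float]) -> None:
        self._get_priority = get_priority
        self._heap = []          # entries are (priority, insertion index, item)
        self._counter = count()  # unique tie-breaker, so items are never compared

    def push(self, item: Any) -> None:
        entry = (self._get_priority(item), next(self._counter), item)
        heapq.heappush(self._heap, entry)

    def extend(self, items: Iterable[Any]) -> None:
        for item in items:
            self.push(item)

    def peek_first_n_items(self, n: int) -> List[Any]:
        # nsmallest reads the heap without removing anything from it
        return [item for _, _, item in heapq.nsmallest(n, self._heap)]


# Assumption: plain dicts stand in for the tuning results
q = SketchPriorityQueue(get_priority=lambda result: -result["accuracy"])
q.extend([{"accuracy": 0.6}, {"accuracy": 0.9}, {"accuracy": 0.7}])

top = q.peek_first_n_items(2)
print([r["accuracy"] for r in top])  # [0.9, 0.7]
```

The insertion index in each heap entry keeps the comparison from ever reaching the item itself, which matters because dictionaries cannot be ordered with `<`.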

Performance Result Buffer

We have implemented a PerformanceResult class, which pairs a hyperparameter configuration with the accuracy it achieved.

from linguaml.tolearn.families import SVCFamily
from linguaml.tolearn.performance import PerformanceResult

result1 = PerformanceResult(
    hp_config=SVCFamily.hp()(
        C=0.1,
        tol=0.01,
        gamma=0.01,
        kernel="linear",
        decision_function_shape="ovo"
    ),
    accuracy=0.6
)

result2 = PerformanceResult(
    hp_config=SVCFamily.hp()(
        C=10,
        tol=0.01,
        gamma=0.1,
        kernel="linear",
        decision_function_shape="ovo"
    ),
    accuracy=0.9
)

result3 = PerformanceResult(
    hp_config=SVCFamily.hp()(
        C=1000,
        tol=0.01,
        gamma=0.1,
        kernel="rbf",
        decision_function_shape="ovr"
    ),
    accuracy=0.7
)
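
A PerformanceResultBuffer keeps only the best results, up to its capacity. With capacity=2, pushing the three results above retains the two with the highest accuracy.

from linguaml.tolearn.performance import PerformanceResultBuffer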

performance_result_buffer = PerformanceResultBuffer(capacity=2)

performance_result_buffer.push(result1)
performance_result_buffer.push(result2)
performance_result_buffer.push(result3)

performance_result_buffer.peek_first_n_high_performance_results(1)
[PerformanceResult(hp_config=SVCConfig(C=10.0, kernel='linear', gamma=0.1, tol=0.01, decision_function_shape='ovo'), accuracy=0.9)]
performance_result_buffer.to_list()
[PerformanceResult(hp_config=SVCConfig(C=10.0, kernel='linear', gamma=0.1, tol=0.01, decision_function_shape='ovo'), accuracy=0.9),
 PerformanceResult(hp_config=SVCConfig(C=1000.0, kernel='rbf', gamma=0.1, tol=0.01, decision_function_shape='ovr'), accuracy=0.7)]
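
The eviction behavior seen above, where the result with accuracy 0.6 is dropped once the capacity of 2 is exceeded, can be sketched with a bounded min-heap. This is only an illustration under stated assumptions, not how PerformanceResultBuffer is actually written: the class SketchResultBuffer is hypothetical, and plain dicts stand in for PerformanceResult objects.

```python
import heapq
from itertools import count
from typing import List


class SketchResultBuffer:
    """Hypothetical stand-in; not linguaml's PerformanceResultBuffer."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._heap = []          # min-heap on accuracy: worst kept result sits at index 0
        self._counter = count()  # unique tie-breaker, so results are never compared

    def push(self, result: dict) -> None:
        entry = (result["accuracy"], next(self._counter), result)
        if len(self._heap) < self._capacity:
            heapq.heappush(self._heap, entry)
        else:
            # Add the new entry, then discard whichever entry is now the worst
            heapq.heappushpop(self._heap, entry)

    def to_list(self) -> List[dict]:
        # Best result first
        return [result for _, _, result in sorted(self._heap, reverse=True)]


buffer = SketchResultBuffer(capacity=2)
for accuracy in (0.6, 0.9, 0.7):
    buffer.push({"accuracy": accuracy})

kept = [r["accuracy"] for r in buffer.to_list()]
print(kept)  # [0.9, 0.7]
```

Keeping the worst retained result at the top of the heap makes each push cost O(log capacity), since only that one entry ever needs to be compared against and evicted.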