Performance Result Buffer#
We need to keep track of the first several best-performing hyperparameter configurations of the model. These top configurations are stored in the hyperparameter configuration buffer.
Priority Queue#
Since the tuning results should be ordered by the accuracy of the model with a certain hyperparameter configuration setting on the validation dataset, the internal implementation of the tuning result buffer should be a priority queue.
We implemented our own PriorityQueue.
Suppose we have the following tuning results:
from linguaml.tolearn.families import SVCFamily
result1 = {
"hp_config": SVCFamily.hp()(
C=0.1,
tol=0.01,
gamma=0.01,
kernel="linear",
decision_function_shape="ovo"
),
"accuracy": 0.6
}
result2 = {
"hp_config": SVCFamily.hp()(
C=10,
tol=0.01,
gamma=0.1,
kernel="linear",
decision_function_shape="ovo"
),
"accuracy": 0.9
}
result3 = {
"hp_config": SVCFamily.hp()(
C=1000,
tol=0.01,
gamma=0.1,
kernel="rbf",
decision_function_shape="ovr"
),
"accuracy": 0.7
}
To put them into a priority queue, you must first define the order. You may do so by initializing the priority queue with a function that defines how the priority of an item is calculated.
You can then push each result into the priority queue.
from linguaml.collections import PriorityQueue
# Creates an empry queue
q = PriorityQueue(
get_priority=lambda result: -result["accuracy"]
)
# Push items
q.push(result1)
q.push(result2)
q.push(result3)
The method peek_first_n_items returns the first several items with the highest priorities.
q.peek_first_n_items(3)
If you wish to push a list of items, you may use the extend method.
q = PriorityQueue(
get_priority=lambda result: -result["accuracy"]
)
q.extend([
result1,
result2,
result3
])
q.peek_first_n_items(3)
Performance Result Buffer#
We have implemented a PerformanceResult class.
from linguaml.tolearn.families import SVCFamily
from linguaml.tolearn.performance import PerformanceResult
result1 = PerformanceResult(
hp_config=SVCFamily.hp()(
C=0.1,
tol=0.01,
gamma=0.01,
kernel="linear",
decision_function_shape="ovo"
),
accuracy=0.6
)
result2 = PerformanceResult(
hp_config=SVCFamily.hp()(
C=10,
tol=0.01,
gamma=0.1,
kernel="linear",
decision_function_shape="ovo"
),
accuracy=0.9
)
result3 = PerformanceResult(
hp_config=SVCFamily.hp()(
C=1000,
tol=0.01,
gamma=0.1,
kernel="rbf",
decision_function_shape="ovr"
),
accuracy=0.7
)
from linguaml.tolearn.performance import PerformanceResultBuffer
performance_result_buffer = PerformanceResultBuffer(capacity=2)
performance_result_buffer.push(result1)
performance_result_buffer.push(result2)
performance_result_buffer.push(result3)
performance_result_buffer.peek_first_n_high_performance_results(1)
[PerformanceResult(hp_config=SVCConfig(C=10.0, kernel='linear', gamma=0.1, tol=0.01, decision_function_shape='ovo'), accuracy=0.9)]
performance_result_buffer.to_list()
[PerformanceResult(hp_config=SVCConfig(C=10.0, kernel='linear', gamma=0.1, tol=0.01, decision_function_shape='ovo'), accuracy=0.9),
PerformanceResult(hp_config=SVCConfig(C=1000.0, kernel='rbf', gamma=0.1, tol=0.01, decision_function_shape='ovr'), accuracy=0.7)]