# State

## State Configuration

Before creating `State` or `BatchedStates` instances, set the class-level attributes of the abstract class `StateConfig`.

In [1]:
from linguaml.rl.state import StateConfig

StateConfig.lookback = 5

The `lookback` attribute is the number of previous hyperparameter configurations the agent looks back on before selecting an action.
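
As a rough illustration of what a lookback window means, the sketch below keeps only the most recent `lookback` records in a fixed-length buffer. It uses only the standard library; the `history` buffer and the made-up (hyperparameters, reward) records are hypothetical and are not part of linguaml.

```python
from collections import deque

lookback = 5  # same value as StateConfig.lookback above

# Hypothetical history buffer: holds at most `lookback` records
history = deque(maxlen=lookback)

# Simulate 8 tuning steps with made-up (hyperparameters, reward) records
for step in range(8):
    record = ({"C": 0.1 * (step + 1)}, 0.5 + 0.01 * step)
    history.append(record)  # the oldest record is dropped once the buffer is full

# Only the last `lookback` steps (3 through 7) remain visible
window = list(history)
print(len(window))
```

With a window like this, the agent would base its next action only on the five most recent configurations and their rewards.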

## State Unit

In [2]:
from linguaml.rl.action import ActionConfig, Action
from linguaml.tolearn.family import Family
from linguaml.tolearn.hp.bounds import NumericHPBounds
from rich import print


ActionConfig.family = Family.SVC
ActionConfig.numeric_hp_bounds = NumericHPBounds.from_dict({
    "C": (0.1, 100),
    "gamma": (0.1, 100),
    "tol": (1e-5, 1e-3)
})

action = Action({
    "C": 0.1,
    "gamma": 0.5,
    "tol": 0.1,
    "kernel": 0,
    "decision_function_shape": 1
})

print(action)

In [3]:
from linguaml.rl.state import StateUnit

state_unit = StateUnit.from_action_and_reward(action, 0.7)

print(state_unit.data)

In [4]:
from linguaml.tolearn.performance import PerformanceResult

performance_result = PerformanceResult(
    hp_config=action.to_hp_config(),
    accuracy=0.7
)

print(performance_result)

In [5]:
print(StateUnit.from_performance_result(performance_result).data)

## Single State

In [6]:
from linguaml.rl.action import BatchedActions
import numpy as np

# Generate random actions
actions = BatchedActions.from_dict({
    "C": np.random.random(StateConfig.lookback),
    "kernel": np.random.randint(0, 4, size=StateConfig.lookback),
    "gamma": np.random.random(StateConfig.lookback),
    "tol": np.random.random(StateConfig.lookback),
    "decision_function_shape": np.random.randint(0, 2, size=StateConfig.lookback)
}).to_actions()

# Generate random rewards
rewards = np.random.random(StateConfig.lookback)

print("actions:")
print(actions)

print("rewards: ")
print(rewards)

It is recommended to construct a `State` instance using the `from_actions_and_rewards` or `from_action_and_reward_pairs` class methods.

### From Actions and Rewards

Construct a state via `from_actions_and_rewards`:

In [7]:
from linguaml.rl.state import State

# Construct a state
state = State.from_actions_and_rewards(actions, rewards)

# Inspect the state's data
state.data

array([[0.68257752, 0.13638174, 0.47804164, 0.        , 0.        ,
        1.        , 0.        , 1.        , 0.        , 0.34778118],
       [0.23265525, 0.10378297, 0.11241113, 1.        , 0.        ,
        0.        , 0.        , 1.        , 0.        , 0.39122254],
       [0.48983385, 0.99002537, 0.43756674, 0.        , 0.        ,
        0.        , 1.        , 1.        , 0.        , 0.50390384],
       [0.45547674, 0.11806534, 0.81058683, 0.        , 0.        ,
        1.        , 0.        , 0.        , 1.        , 0.50052495],
       [0.11504021, 0.76019095, 0.1843311 , 0.        , 1.        ,
        0.        , 0.        , 1.        , 0.        , 0.34659101]])
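
The ten columns in each row above are consistent with three numeric values (`C`, `gamma`, `tol`), a four-way one-hot vector for `kernel`, a two-way one-hot vector for `decision_function_shape`, and the reward in the last column. This layout is inferred from the printed array, not from the library source. A minimal NumPy sketch of such an encoding, using the hypothetical helper `encode_row`:

```python
import numpy as np

def encode_row(numeric, kernel, decision_function_shape, reward):
    """Hypothetical helper: flatten one action and its reward into a 1-D row."""
    kernel_one_hot = np.eye(4)[kernel]                   # 4 kernel choices
    shape_one_hot = np.eye(2)[decision_function_shape]   # 2 shape choices
    return np.concatenate([numeric, kernel_one_hot, shape_one_hot, [reward]])

row = encode_row(
    numeric=[0.68, 0.14, 0.48],  # assumed order: C, gamma, tol
    kernel=2,
    decision_function_shape=0,
    reward=0.35,
)
print(row.shape)
```

Stacking `lookback` such rows would produce an array with the same 5 x 10 shape as `state.data`.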

Construct a state via `from_action_and_reward_pairs`:

In [8]:
action_reward_pairs = list(zip(actions, rewards))
print(action_reward_pairs)

In [9]:
state = State.from_action_and_reward_pairs(action_reward_pairs)

print(state.data)

### From Performance Results

Internally, `from_performance_results` first converts each `PerformanceResult` to an action and a reward, and then calls `from_actions_and_rewards`.

Prefer `from_actions_and_rewards` when the actions and rewards are already available. Use `from_performance_results` only when all you have are `PerformanceResult` objects.

In [10]:
performance_results = [
    PerformanceResult(
        hp_config=action.to_hp_config(),
        accuracy=reward
    )
    for action, reward in action_reward_pairs
]

print(performance_results)

In [11]:
state = State.from_performance_results(performance_results)

print(state.data)

### To PyTorch Tensor

In [12]:
print(state.to_tensor())

## Batched States

In [13]:
from linguaml.rl.state import BatchedStates

batched_states = BatchedStates.from_states([state, state])

print(batched_states.data)
print(f"shape: {batched_states.data.shape}")
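
If `BatchedStates` stacks its states along a new leading axis (an assumption; the shape printed by the cell above is the authoritative check), two states of shape `(lookback, 10)` would give an array of shape `(2, lookback, 10)`. The plain NumPy equivalent of that stacking, with a zero-filled stand-in for `state.data`:

```python
import numpy as np

lookback, n_features = 5, 10  # matches the 5 x 10 state array shown earlier

# Stand-in for state.data; the real values come from the State instance
state_data = np.zeros((lookback, n_features))

# Stack two copies along a new leading batch axis
batch = np.stack([state_data, state_data], axis=0)
print(batch.shape)
```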

In [14]:
print(batched_states.to_tensor())