The success of the GPT series proves that GPT can extract general inform...
Recent studies have shown that episodic reinforcement learning (RL) is n...
Knowledge distillation uses a high-performance teacher network to gu...
We study reward-free reinforcement learning (RL) with linear function ap...
We study linear contextual bandits in the misspecified setting, where th...
Thanks to the power of representation learning, neural contextual bandit...
We study model-based reward-free reinforcement learning with linear ...
The success of deep reinforcement learning (DRL) is due to the power of ...
Thompson Sampling (TS) is one of the most effective algorithms for solvi...
Actor-critic (AC) methods have exhibited great empirical success compare...
We apply Faster R-CNN to the detection of characters in namecards, in ord...