Vision-and-Language (VL) pre-training has shown great potential on many
...
Texts appearing in daily scenes that can be recognized by OCR (Optical
C...
Text-based Visual Question Answering (TextVQA) is a recently raised chal...
This technical report attempts to provide efficient and solid kits addre...