Abstract: We present TextMonkey, a large multimodal model (LMM) tailored for text-centric tasks. Our approach introduces enhancements across several dimensions: by adopting Shifted Window Attention ...