arXiv: 2406.02550 - Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks. This paper