1. Abernethy, J. D., Bartlett, P., & Rakhlin, A. (2007). Multitask learning with expert advice (Tech. Rep. UCB/EECS-2007-20). EECS Department, University of California, Berkeley.
2. Ando, R., & Zhang, T. (2005). A framework for learning predictive structure from multiple tasks and unlabeled data. Journal of Machine Learning Research (JMLR), 6, 1817–1853.
3. Arnold, A., Nallapati, R., & Cohen, W. W. (2008). Exploiting feature hierarchy for transfer learning in named entity recognition. In Association for Computational Linguistics (ACL).
4. Bakker, B., & Heskes, T. (2003). Task clustering and gating for Bayesian multi-task learning. Journal of Machine Learning Research (JMLR), 4, 83–99.
5. Ben-David, S., Blitzer, J., Crammer, K., & Pereira, F. (2006). Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems (NIPS).