Job Opportunity

Contact us if you are interested in our job opportunities. (Please contact us if you have more than 2 first-author papers published in top-tier conferences such as NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, KDD, WWW, AAAI, IJCAI, or SIGIR. We prefer PhD students.)

Hiring Research Interns

The Search Mission Query Understanding (SMU) team in Amazon Search is looking for PhD students worldwide in NLP, Information Retrieval, and Data Mining to join us in 2023 as research interns. As a PhD intern, you will solve challenging problems such as knowledge graph construction and mining, entity recognition and linking, unsupervised and weakly supervised learning, multi-task and multilingual learning, user behavior graph mining, large-scale machine learning, and multi-turn user interaction modeling. You will enjoy the internship if you like working on massive data sets, seeing your research results make a production impact worth hundreds of millions of dollars, and publishing your results at top conferences. Our internship focuses on publications. In addition to highly competitive compensation, you will receive full support for your research and publications during your internship, including mentorship and advice from top scholars in the field. We look for interns in Spring, Summer, and Fall. Contact:

Why Choose Us?

Scientists and interns love our team culture

  • Research and publication obsession: every intern has full support to publish a top-conference paper
  • Massive data and massive hardware support
  • Strong academic collaboration on research
  • Research work that matters in production: our research work makes its way to production and real customer experience improvements
  • Freedom and fun: we support 20% side projects, monthly team building, reading groups, hackathons, career coaching, and mentorship

Image Credit: DALL·E 2

Research Areas

  • Learning with limited guidance: Our datasets are massive, at the trillion scale. Labeling even a small portion of them is cost-prohibitive. Additionally, the volume of human labels varies by NLP task. We are doing active research in active learning, semi/weakly-supervised learning, transfer learning, meta-learning, and multi-task learning to build e-commerce query understanding with limited human guidance.
  • Knowledge Graph Mining: Products on Amazon form a giant knowledge graph. We are doing research to extract structured knowledge from product textual and image descriptions. Additionally, we have a massive trillion-scale heterogeneous graph of products, entities, queries, sellers, and customers. We are doing active research on graph mining to extract commonsense knowledge from the graph.
  • Deep Learning and NLP: We actively work on NLP tasks for language modeling, query rewriting, entity linking, few-shot learning, text generation, and QnA.
  • Search and Recommendation: We work on classic information retrieval research for recommendation and multilingual search.
  • Large Scale Machine Learning: We work on data embedding and compression, model distillation, and large language model pretraining, leveraging the trillion-scale data and new AWS hardware.
  • Conversational Shopping: Shopping is a multi-turn engagement model. How can we use contextual bandits for multi-turn engagement and reinforcement learning to make our search engine act like an intelligent salesperson?


Visiting Scholars

Team Members


  • Query Attribute Recommendation at Amazon Search
    Chen Luo, William Headden, Neela Avudaiappan, Haoming Jiang, Tianyu Cao, Qingyu Yin, Yifan Gao, Zheng Li, Rahul Goutam, Haiyang Zhang, Bing Yin (RecSys 2022)

  • Task-Agnostic Graph Explanations [pdf]
    Yaochen Xie, Sumeet Katariya, Xianfeng Tang, Edward Huang, Nikhil Rao, Karthik Subbian, Shuiwang Ji (NeurIPS 2022)

  • Learning to Sample and Aggregate: Few-shot Reasoning over Temporal Knowledge Graph [pdf][code]
    Ruijie Wang, Zheng Li, Dachun Sun, Shengzhong Liu, Jinning Li, Bing Yin, Tarek Abdelzaher (NeurIPS 2022)

  • Multilingual knowledge graph completion with self-supervised adaptive graph alignment [pdf][code][data]
    Zijie Huang, Zheng Li, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang (ACL 2022)

  • RETE: Retrieval-enhanced temporal event forecasting on unified query product evolutionary graph [pdf][code]
    Ruijie Wang, Zheng Li, Danqing Zhang, Qingyu Yin, Tong Zhao, Bing Yin, Tarek Abdelzaher (WWW 2022)

  • ROSE: Robust caches for Amazon product search
    Chen Luo, Vihan Lakshman, Anshumali Shrivastava, Tianyu Cao, Sreyashi Nag, Rahul Goutam, Hanqing Lu, Yiwei Song, Bing Yin (WWW 2022)

  • ALLIE: Active learning on large-scale imbalanced graphs
    Limeng Cui, Xianfeng Tang, Sumeet Katariya, Nikhil Rao, Pallav Agrawal, Karthik Subbian, Dongwon Lee (WWW 2022)

  • Massive text normalization via an efficient randomized algorithm
    Nan Jiang, Chen Luo, Vihan Lakshman, Yesh Dattatreya, Yexiang Xue (WWW 2022)

  • Search filter ranking with language-aware label embeddings
    Jacek Golebiowski, Felice Antonio Merra, Ziawasch Abedjan, Felix Biessmann (WWW 2022)

  • Condensing Graphs via One-Step Gradient Matching [pdf][code]
    Wei Jin, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin (SIGKDD 2022)

  • Can clicks be both labels and features? Unbiased behavior feature collection and uncertainty-aware learning to rank
    Tao Yang, Chen Luo, Hanqing Lu, Parth Gupta, Bing Yin, Qingyao Ai (SIGIR 2022)

  • CERES: Pretraining of graph-conditioned transformer for semi-structured session data
    Rui Feng, Chen Luo, Qingyu Yin, Bing Yin, Tuo Zhao, Chao Zhang (NAACL 2022)

  • Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training [pdf][code][data]
    Yifan Gao, Qingyu Yin, Zheng Li, Rui Meng, Tong Zhao, Bing Yin, Irwin King, Michael R. Lyu (NAACL 2022, Findings)

  • SeqZero: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models
    Jingfeng Yang, Haoming Jiang, Qingyu Yin, Danqing Zhang, Bing Yin, Diyi Yang (NAACL 2022, Findings)

  • MetaTS: Meta Teacher-Student Network for Multilingual Sequence Labeling with Minimal Supervision
    Zheng Li, Danqing Zhang, Tianyu Cao, Ying Wei, Yiwei Song and Bing Yin (EMNLP 2021)

  • QUEACO: Borrowing treasures from weakly labeled behavior data for query attribute value extraction
    Danqing Zhang*, Zheng Li*, Tianyu Cao, Chen Luo, Tony Wu, Hanqing Lu, Yiwei Song, Bing Yin, Tuo Zhao and Qiang Yang (CIKM 2021, Industry track, * denotes equal contribution)

  • Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data [pdf][code]
    Haoming Jiang, Danqing Zhang, Tianyu Cao, Bing Yin, Tuo Zhao (ACL 2021, Long paper, oral)

  • Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning [pdf]
    Hui Liu, Danqing Zhang, Bing Yin and Xiaodan Zhu (NAACL 2021)

  • Graph-based Multilingual Product Retrieval in E-Commerce Search [pdf]
    Hanqing Lu, Youna Hu, Tong Zhao, Tony Wu, Yiwei Song and Bing Yin (NAACL 2021, Industry track)

  • Learn to Cross-lingual Transfer with Meta Graph Learning Across Heterogeneous Languages [pdf]
    Zheng Li, Mukul Kumar, William Headden, Bing Yin, Ying Wei, Yu Zhang, Qiang Yang (EMNLP 2020, Long paper, oral)

  • QUEEN: Neural Query Rewriting in E-commerce [pdf]
    Yaxuan Wang*, Hanqing Lu*, Yunwen Xu, Rahul Goutam, Yiwei Song and Bing Yin (WWW KMECommerce Workshop 2021)

  • Unsupervised Synonym Extraction for Document Enhancement in E-commerce Search [pdf]
    Hanqing Lu, Yunwen Xu, Qingyu Yin, Tianyu Cao, Boris Aleksandrovsky, Yiwei Song, Xianlong Fan and Bing Yin (WWW KMECommerce Workshop 2021)

  • Hierarchical Multi-label Classification of Queries to Browse Categories [pdf]
    Heran Lin, Pengcheng Xiong, Danqing Zhang, Fan Yang, Ryoichi Kato, Mukul Kumar, William Headden and Bing Yin (SIGIR ECommerce Workshop 2020 Best Paper)




Upcoming, Current, and Alumni
