Sunday, June 16, 2024

Rejections: Tale of Two Papers


My (rad) research team at EuroSys 2024; from left to right, Ehsan, me, Robert, Ties. After I introduced them to Gustavo Alonso, he asked if I had a height requirement for my PhD students.

I have already written about rejections in academia once (in a post titled Rejections). This is a never-ending story for most academics. Recently, two papers from my group got published and presented at conference workshops after 5 and 4 conference rejections, respectively. Hence, I decided to revisit the topic of rejections by focusing on these two papers.

Before I start with the individual papers, I would like to acknowledge a few things.

First, I am biased with respect to my own work. The main reason I work on what I work on is that I find the topic exciting and important. Otherwise, I wouldn’t be working on it.

Second, like everyone I know, I have a drive to share what I find exciting and important with others. That is why I welcome any opportunity to present our work and enjoy writing about it in the form of academic papers. Through this dissemination process, we share and build knowledge, get constructive criticism from our peers to improve the work, and start collaborations and new research directions. Not everyone shares the same level of excitement about the same research topics, though, and there is always room for improvement in a work. As a result, the feedback you receive sometimes sounds discouraging. This discouragement, combined with how much one’s career depends on publications, makes rejections difficult even though we all know they are inevitable in our profession.

Third, I am aware that my health is the most important thing, and nothing I do or accomplish at work makes me as happy as the time I spend with the people who make me feel at home, or at a movie theater, or at the beach. But I don’t want to diminish people’s career ambitions, especially in a world where women’s career ambitions are still under-supported, by over-emphasizing these cliché-but-true statements about health and happiness.

 

An Analysis of Collocation on GPUs for Deep Learning Training

Ties Robroek, Ehsan Yousefzadeh-Asl-Miandoab, Pınar Tözün

EuroMLSys 2024 - https://dl.acm.org/doi/10.1145/3642970.3655827

This paper characterizes the performance of the different task collocation methods available on NVIDIA GPUs for deep learning training. The motivation came after realizing that not everyone who trains deep learning models is <insert your favorite big tech company here>. Thus, not every model training needs many GPUs or even the entire resources of a single GPU. This means that if we always train one model at a time on a GPU, that GPU is likely a wasted resource. Wasting hardware resources wastes both money and energy. Studying how deep learning tasks can effectively share the resources of a GPU, therefore, made sense and was a relatively under-researched subject at the time.

We started back in September 2021 when my first PhD student[1] Ties Robroek joined my 1-person team. A couple of MSc students, Anders Friis Kaas and Stilyan Petrov Paleykov, were also interested in the topic for their MSc thesis project. The initial team was formed. 

We started with an investigation into the MIG (multi-instance GPU) technology, since it was the newest thing offered by NVIDIA GPUs at the time. MIG allows a GPU to be split into smaller units enabling task collocation with isolation guarantees.
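For readers unfamiliar with MIG, the rough workflow on a supported GPU looks like the sketch below. This is an illustrative example, not the setup used in the paper; the profile IDs and instance sizes are assumptions that vary by GPU model, so check `nvidia-smi mig -lgip` on your own hardware.

```shell
# Enable MIG mode on GPU 0 (requires admin rights; may need a GPU reset)
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this device supports
nvidia-smi mig -lgip

# Create two GPU instances (profile ID 9 here as an example,
# e.g., 3g.20gb on an A100) and a compute instance on each (-C)
sudo nvidia-smi mig -i 0 -cgi 9,9 -C

# Each MIG instance then shows up as its own device; a training job can
# be pinned to one via its UUID, for example:
# CUDA_VISIBLE_DEVICES=MIG-<UUID> python train.py
```

Each instance gets its own slice of compute and memory, which is what provides the isolation guarantees between collocated tasks.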

1st reject: The MSc students finished their thesis in June 2022. The SoCC (ACM Symposium on Cloud Computing) paper submission deadline was around that time, so we submitted the work from their thesis there. It got rejected with overall constructive and encouraging reviews. The main issue was that reviewers thought the paper didn’t have enough lessons learned to warrant a SoCC publication. However, we got clear ideas for improving the paper for a possible resubmission. The key suggestion was to expand the study beyond MIG and add a comparison to the other collocation methods on NVIDIA GPUs, namely multi-streams and Multi-Process Service (MPS).

2nd reject: For this submission, we included Ehsan Yousefzadeh-Asl-Miandoab, my second PhD student, in the study. The paper was almost entirely redone. We submitted the outcome to MLSys (Conference on Machine Learning and Systems) in fall 2022. It got rejected again. While the reviews were slightly less encouraging than at SoCC, they were overall constructive. The reviewers didn’t find the results surprising enough, asked for deeper analysis of some experiments, suggested adding more diverse deep learning models to the study, and asked for scenarios that involve multiple GPUs.

3rd reject: Following the MLSys reviews, for the next resubmission, we added more diverse models, dug deeper into certain results, changed the metrics we report to give a finer-grained picture of GPU resource utilization, and wrote clearer guidelines for when it makes sense to use each collocation mechanism. The last point was to address the “no surprising result” comment. Since we cannot create surprising results out of nowhere, wrapping them up in clearer “take-away messages” made more sense. Finally, we deemed the multi-GPU case out of scope for this study, since I strongly believe in the importance of optimizing things at the smaller scale as much as at the larger scale.

The resulting paper was submitted to ASPLOS (ACM International Conference on Architectural Support for Programming Languages and Operating Systems) in the spring of 2023 and was rejected once again. The workload diversity was praised by some reviewers, while others asked for alternative workloads. However, the unsurprising results and the lack of deeper insights were once again the main issues.

4th reject: I was overall optimistic after the first two rejects, because we had clear ideas for improving the paper for a resubmission. I think the paper indeed got substantially better as a result of those resubmissions. However, after the 3rd reject, I didn’t know how to improve the paper anymore. We couldn’t conjure up surprising results. Further in-depth analysis wasn’t easy because not every GPU hardware detail is openly shared by the vendor. Of course, one can always apply extra analysis through more profiling and add more workloads to the study if one has infinite time. However, I thought it would be better for the students to move on to the next stage of their PhD at this point. They also had the desire to move on. So, we decided to resubmit the paper to HPCA (IEEE International Symposium on High-Performance Computer Architecture) during the summer of 2023 without making extensive changes this time around. It got rejected with reviews similar to those from ASPLOS.

5th reject: In one last attempt, we resubmitted the paper to SIGMOD (ACM International Conference on Management of Data) in fall of 2023 with extra results but no substantial changes. I wasn’t sure if SIGMOD was the right venue for this type of work, but as a SIGMOD reviewer I have seen papers on better utilizing GPU resources for deep learning welcomed by some, if not most, of the program committee. I also thought that the insights we deliver on GPUs may be interesting to the data systems community. We got rejected again, mainly due to the straightforward lessons learned and the topic being a borderline fit for SIGMOD.

Accept: Finally, I decided to stop trying to force this paper into a conference. Even though we put a lot of work into it, our findings were clearly not enough for a conference publication. In my team, we really like the EuroMLSys workshop (Workshop on Machine Learning and Systems) that is collocated with the EuroSys conference. Therefore, it was a natural choice for us, and the paper got accepted with a presentation slot at EuroMLSys 2024.

 

Reaching the Edge of the Edge: Image Analysis in Space

Robert Bayer, Julian Priest, Pınar Tözün

DEEM 2024 – https://dl.acm.org/doi/10.1145/3650203.3663330

This paper characterizes the performance of several resource-constrained hardware devices to determine their suitability for an image-filtering task on a small (hence, extra-constrained) satellite.

The roots of this paper also go back to 2021, though the actual work on our end didn’t start until spring 2022. In 2021, Julian Priest joined our lab. He is the main representative of the DISCO (Danish Student CubeSat Program) at our university. DISCO is an educational project that involves several Danish universities. It gives students the opportunity to design and operate a small satellite. The target use case is Earth observation; more specifically, taking images of Earth from the satellite and analyzing them. The challenge with this use case is that the communication link between Earth and the satellite isn’t your typical on-Earth internet connection; it is weak and intermittent. Hence, sending all the images captured on the satellite is not an option. There is a need for image filtering on the satellite, so that only the images of substantial interest are sent to Earth. This need for filtering images leads to a follow-up challenge: the computation power that can be deployed on a small satellite is also small due to both space and power restrictions of the satellite. Hence, there was a need to identify the hardware device(s) to deploy on such a satellite that could satisfy the size, power, and image-filtering latency requirements.

Since I joined ITU, I have also been interested in analyzing the performance of a variety of small hardware devices. In general, I always look for good excuses for benchmarking hardware. :) Hence, DISCO was a fantastic excuse. We also had the perfect student to lead the work, Robert Bayer, who was a student assistant with me then and is now one of my PhD students.

1st reject: The hardware benchmarking for DISCO started in spring 2022. I thought it could be interesting to write up the results and submit something to CIDR (Conference on Innovative Data Systems Research) 2023. CIDR values papers on interesting and challenging data systems, and in my opinion the image processing pipeline of DISCO fits into this category. The reviewers, however, didn’t agree with me on the data systems connection, so the paper got rejected. Otherwise, two out of three reviewers had a positive tone.

2nd reject: After the CIDR rejection, I thought the SIGMOD 2024 Data-Intensive Applications track could be a fit for this topic. This was also suggested by one of the positive CIDR reviewers. We added one more hardware device to our study, re-measured power consumption on all devices with a more precise external meter, included details on the satellite components, and submitted the paper. Around the submission time, April 2023, the first DISCO satellite, built based on the results presented in the submitted paper, was launched into space. I thought this submission was the best paper I had ever co-authored in my entire career (no offense to the co-authors of my other papers), but no one else agreed. The paper got rejected once again, mainly due to being a misfit for SIGMOD’s data management focus.

3rd reject: After two tries with data management venues, I thought it would be better to target a systems venue, as also suggested by some of the reviewers who rejected the paper. Thus, we made minor adjustments to the paper based on the feedback from previous reviews and submitted it to ASPLOS 2024’s summer 2023 round. We got more detailed feedback, since no one thought the paper was a misfit for ASPLOS. However, overall, the reviewers found the results not novel and surprising enough for ASPLOS and the focus on a single application too narrow, even though they all appreciated the motivation of the work. Hence, the paper was rejected once again.

10 days after receiving this rejection, Robert won the best Computer Science MSc thesis award in Denmark for the same work.

4th reject: When we received the 3rd reject, the submission deadlines for MLSys 2024 and EuroSys 2024, which would have been other relevant systems venues for this work, had already passed. The other option, which was also recommended by one of the CIDR reviewers, was MobiSys, but that was a whole different world for me, and I wasn’t sure if I wanted to jump into a third community while already doing a bad job juggling the data management and systems communities. Therefore, I recommended that Robert target VLDB’s (International Conference on Very Large Data Bases) Scalable Data Science track. Based on the call for papers, both Robert and I thought the paper’s topic fit there. We were wrong once again. The paper got rejected mainly due to being unfit for VLDB.

Accept: This paper was tied to a real-world application deployment: the first DISCO satellite. Hence, there wasn’t much room to improve the work to please conference reviewers. We could do more benchmarking, but the satellite was already in space based on our existing results. The paper as is had closure and real-world impact. Thus, to avoid delaying the publication further, I once again gave up on conferences and started to think about relevant workshops. Robert also needed to move on. I thought the DEEM (Data Management for End-to-End Machine Learning) workshop, which I like very much, collocated with the SIGMOD conference, would be a nice venue for this work. I emailed the workshop chairs to double-check the suitability of the topic to avoid another “this is unfit” rejection. They kindly confirmed that the topic was in scope for them. So, we submitted the paper to DEEM 2024, and it got accepted.

 

I personally enjoy and value some conference workshops more than the main conference. Workshops gather the subset of people in a research community with similar research interests. They can be far more effective for exposing your work to the right audience than the conference itself. Similarly, the talks at a workshop in your research area are usually more relevant for you content-wise. So, I am happy that my students had a chance to present their hard work at these workshops, which I regard very highly.

However, a workshop publication unfortunately doesn’t count as much as a conference publication on one’s CV when people evaluate you for academic positions or grant submissions. A couple of years ago, a postdoc candidate I wished to hire mentioned that I didn’t seem to have many recent publications. This wasn’t the main reason he declined my offer in the end, but it was something he noted, and I am sure others do the same. This is how our profession works.

It has been more than 6 years since I joined ITU and almost 3 years since I got my first PhD student. I still don’t have a conference paper with my own PhD students. If ITU had a more traditional tenure-track scheme, I wouldn’t have gotten tenure. Earlier this year, I went down the rabbit hole trying to figure out what I was doing wrong and what I could do better in the future. The list was too long, and none of the answers were soothing. Deeper into the hole, I questioned whether I was a shit advisor or a complete failure at my job. Tori Amos’ Crucify played over and over in my head, especially the lines “Nothing I do is good enough for you, so I crucify myself every day.” and “got enough guilt to start my own religion.”

Luckily, Crucify ends with “Never going back again to crucify myself every day.”

I know I made mistakes and misjudgments and will likely keep making them. I know the struggle is partly due to changing my research field and trying to build up my own research group from scratch without any starting funding. I know systems work takes time to get published; 2 years or more is the common case. I know the 3-year PhD duration in Denmark freaks me out as a result and makes me more impatient than I should be for publications. I know everyone’s papers get rejected; even the works of the people I admire. I know one of my favorite conferences, CIDR, was founded by people whose work was underappreciated and rejected by VLDB and SIGMOD. I know I still get invited for talks, and when I present my team’s work to others, I get positive feedback overall, unless people are lying to my face. I know many colleagues at ITU appreciate me. Most importantly, I know, at my job, regardless of the rejections, I learn a lot and get the most fulfillment from the work I do with my students. 



[1] Technically, I had a PhD student earlier through co-supervision. The co-supervision ended, and the student stayed with the other supervisor. That is why I count Ties as my first PhD student.