BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//6.0.1.1//EN
X-WR-TIMEZONE:Asia/Jerusalem
BEGIN:VEVENT
UID:16@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20201104T123000
DTEND;TZID=Asia/Jerusalem:20201104T133000
DTSTAMP:20201109T060825Z
URL:https://dds.technion.ac.il/iemevents/surprises-in-deep-learning-traini
ng/
SUMMARY:Surprises in Deep Learning Training
DESCRIPTION:By: Dr. Daniel Soudry\n Advisors: \n Where: Zoom From:\nTechn
ion\nAbstract: We make a few interesting observations regarding the trai
ning of deep neural networks (DNNs):\n\n1) DNNs are typically initialize
d with (a) random weights in order to (b) break symmetry and (c) promote
 feature diversity.\n\nWe demonstrate that (a)\, (b)\, and (c) are not n
ecessary at initialization in order to obtain high accuracy at the end o
f training. (ICML 
2020)\n\n2) Large batch training is commonly used to accelerate training.
\n\nWe improve final accuracy by increasing the batch size with more data
augmentations\, instead of more samples. (CVPR2020)\n\n3) Quantization of
full precision trained DNNs while retaining high accuracy typically requi
res fine-tuning the model on a large dataset.\n\nWe generate synthetic da
ta from the DNN Batch-Norm statistics\; we then use it to fine-tune the DN
N\, without any real data (CVPR2020).\n\n4) Asynchronous training causes a
degradation in generalization\, even after training has converged to a st
eady-state.\n\nWe close this gap by adjusting hyper-parameters according
to a theoretical framework aiming to maintain minima stability. (ICLR2020)
\n\nZOOM
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:25@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20201125T123000
DTEND;TZID=Asia/Jerusalem:20201125T133000
DTSTAMP:20201115T054218Z
URL:https://dds.technion.ac.il/iemevents/predicting-the-utility-of-screeni
ng-human-ivf-embryos-for-complex-traits-and-diseases/
SUMMARY:Predicting the utility of screening human IVF embryos for complex traits and diseases
DESCRIPTION:By: Shai Carmi \n Advisors: \n Where: ZOOM From:\nHebrew Unive
rsity\nAbstract: Extremely large genetic studies conducted in the past few
years have identified numerous mutations associated with complex traits a
nd diseases\, and have been translated into increasingly accurate predicti
on models. Popular predictors are based on "polygenic risk scores" (PRSs)\
, which are roughly counts of the number of risk (or trait) increasing mu
tations carried by an individual. One clinical application of PRSs\, perha
ps not originally envisioned by many researchers\, is the selection of hum
an IVF embryos for implantation based on their PRS for a complex disease o
r trait. Clearly\, prioritizing human embryos based on genetic scores is f
raught with ethical and social concerns\, from stigmatization and inequali
ty to eugenics. Nevertheless\, embryo screening is already offered to cons
umers\, with no oversight\, and with barely any research to support its cl
inical utility. In my talk\, I will describe statistical models and simula
tions that predict the utility of embryo screening. We show that when embr
yos are selected for implantation based on their predicted traits (e.g.\,
height or intelligence)\, the expected increase in trait is relatively sma
ll and subject to wide uncertainty. In contrast\, selecting embryos for re
duced predicted disease risk may achieve extremely large relative risk red
uctions for complex diseases such as schizophrenia or diabetes.\n\nLink to
Zoom
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:30@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20201209T123000
DTEND;TZID=Asia/Jerusalem:20201209T133000
DTSTAMP:20201130T082836Z
URL:https://dds.technion.ac.il/iemevents/learning-randomly-perturbed-struc
tured-predictors-for-direct-loss-minimization/
SUMMARY:Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization
DESCRIPTION:By: MSc Hedda Cohen (Technion)\n Advisors: Prof. Tamir Hazan\n
Where: Zoom From:\nTechnion\nAbstract: Direct loss minimization is a popul
ar approach for learning predictors over structured label spaces. This ap
proach is computationally appealing as it replaces integration with optim
ization and allows gradients to propagate in a deep net using loss-pertur
bed prediction. Recently\, this technique was extended to generative model
s\, by introducing a randomized predictor that samples a structure from a
randomly perturbed score function. In this work\, we interpolate between
these techniques by learning the variance of randomized structured predict
ors as well as their mean\, in order to balance between the learned score
function and the randomized noise.\n\nWe extend the direct optimization
technique to\nlearn this balance\, in a high-dimensional structured label
setting.\n\nWe demonstrate empirically the effectiveness of learning th
is balance in two diverse structured discrete spaces.\n\nSemin
ar Zoom Link
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:34@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20201216T123000
DTEND;TZID=Asia/Jerusalem:20201216T133000
DTSTAMP:20201209T083116Z
URL:https://dds.technion.ac.il/iemevents/causalm-causal-model-explanation-
through-counterfactual-language-models/
SUMMARY:CausaLM: Causal Model Explanation Through Counterfactual Language Models
DESCRIPTION:By: PhD Amir Feder\n Advisors: Prof. Roi Reichart\n Where: Zoo
m From:\nTechnion \nAbstract: Understanding predictions made by deep neur
al networks is notoriously difficult\, but also crucial to their dissemina
tion. Like all ML-based methods\, they are only as good as their training data\,
and can also capture unwanted biases. While there are tools that can help
understand whether such biases exist\, they do not distinguish between cor
relation and causation\, and might be ill-suited for text-based models and
for reasoning about high level language concepts. A key problem of estima
ting the causal effect of a concept of interest on a given model is that t
his estimation requires the generation of counterfactual examples\, which
is challenging with existing generation technology. To bridge that gap\, w
e propose CausaLM\, a framework for producing causal model explanations us
ing counterfactual language representation models. Our approach is based o
n fine-tuning of deep contextualized embedding models with auxiliary adver
sarial tasks derived from the causal graph of the problem. Concretely\, we
show that by carefully choosing auxiliary adversarial pre-training tasks\
, language representation models such as BERT can effectively learn a coun
terfactual representation for a given concept of interest\, and be used to
estimate its true causal effect on model performance. A byproduct of our
method is a language representation model that is unaffected by the tested
concept\, which can be useful in mitigating unwanted bias ingrained in th
e data.\n\nBio: Amir is a PhD student at the Technion - Israel Institute
of Technology\, working under the supervision of Prof. Roi Reichart. He
develops methods that integrate causality into natural language processing
to generate more robust and interpretable models. He is also interested i
n investigating and developing linguistically informed algorithms for pred
icting and understanding human behavior.\n\nAmir is currently a research i
ntern at Google’s Medical Brain team\, where he develops natural languag
e processing algorithms for clinical predictions.\n\nZoom Link
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:35@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20201223T123000
DTEND;TZID=Asia/Jerusalem:20201223T133000
DTSTAMP:20201209T083510Z
URL:https://dds.technion.ac.il/iemevents/perl-pivot-based-domain-adaptatio
n-for-pre-trained-deep-contextualized-embedding-models/
SUMMARY:PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
DESCRIPTION:By: PhD Eyal Ben David\n Advisors: Prof. Roi Reichart\n Where:
Zoom From:\nTechnion\nAbstract: Pivot-based neural representation models
have led to significant progress in domain adaptation for NLP. However\,
previous works that follow this approach utilize only labeled data from th
e source domain and unlabeled data from the source and target domains\, bu
t neglect to incorporate massive unlabeled corpora that are not necessaril
y drawn from these domains. To alleviate this\, we propose PERL: A represe
ntation learning model that extends contextualized word embedding models s
uch as BERT with pivot-based fine-tuning. PERL outperforms strong baseline
s across 22 sentiment classification domain adaptation setups\, improves i
n-domain model performance\, yields effective reduced-size models and incr
eases model stability.\n\nBio: Eyal Ben-David is a PhD student in the Nat
ural Language Processing (NLP) group\, supervised by Roi Reichart\, at the
Faculty of Industrial Engineering and Management at the Technion - Israel
Institute of Technology. His research focuses on low-resource scenarios i
n NLP\, such as: domain adaptation\, zero-shot learning\, semi-supervised
learning\, and unsupervised learning. He also explores methods for integra
ting world knowledge into deep neural networks\, in order to improve model
s' representations and achieve generalization beyond the training distribu
tion.\n\nZoom Link
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:36@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210120T123000
DTEND;TZID=Asia/Jerusalem:20210120T133000
DTSTAMP:20201221T121714Z
URL:https://dds.technion.ac.il/iemevents/towards-low-resource-multilingual
-nlp-via-intuitive-question-answering/
SUMMARY:Towards Low-Resource Multilingual NLP via Intuitive Question Answering
DESCRIPTION:By: Dr. Gabriel Stanovsky\n Advisors: \n Where: Zoom From:\nHe
brew University of Jerusalem\nAbstract: Collecting high-quality annotated
data is a major bottleneck in developing multilingual NLP applications\, a
s it often demands formulating and adhering to rigorous linguistic annotat
ion guidelines which require considerable effort and expertise from task d
evelopers and recruited annotators. In this talk\, I will present a line o
f work which obviates the need for formal linguistic guidelines through si
mple QA annotation\, shown to be intuitive enough for large-scale\, non-expert a
nnotation\, leading to state-of-the-art models. I will outline the impleme
ntation of this approach in ongoing work towards datasets and models fo
r 3 longstanding NLP tasks in 6 diverse languages (including Hebrew\, Arab
ic\, and Yiddish)\, and a promising model capable of transcribing missing
parts in cuneiform tablets in extinct languages (Akkadian and Sumerian).\n
Bio: Dr. Stanovsky is a senior lecturer at the Hebrew University of Jeru
salem. He did his postdoctoral research at the University of Washington an
d the Allen Institute for AI in Seattle\, working with Prof. Luke Zettlemo
yer and Prof. Noah Smith\, and his Ph.D. with Prof. Ido Dagan at Bar-Ilan
University. He is interested in developing text-processing models that exh
ibit facets of human intelligence with benefits for users in real-world ap
plications. His work has received awards at top-tier conferences\, includi
ng ACL and CoNLL\, and recognition in popular journals such as Science and
The New York Times.\nZoom Link
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:88@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210621T100000
DTEND;TZID=Asia/Jerusalem:20210621T110000
DTSTAMP:20210615T094334Z
URL:https://dds.technion.ac.il/iemevents/action-robust-reinforcement-learn
ing/
SUMMARY:Action Robust Reinforcement Learning
DESCRIPTION:By: PhD Chen Tessler\n Advisors: Prof. Shie Mannor\n Where: ZOO
M From:\nTechnion\nAbstract:\n\nA policy is said to be robust if it maximi
zes the reward while considering a bad\, or even adversarial\, model. In t
his work we formalize two new criteria of robustness to action uncertainty
. We consider a case where the agent selects an action and with probabilit
y p a different\, possibly adversarial\, action will take place. We presen
t an algorithm and analyze its convergence from a tabular point of view. W
e then draw the similarities to practical schemes and show how to approxim
ate this learning scheme using deep learning tools. Our empirical results
show the ability of this method to produce robots that are robust to uncer
tainty.\n\nBio: I'm a 4th year PhD student under the supervision of Prof.
Shie Mannor. My PhD focuses on the intersection of theory and practice in
reinforcement learning\, looking at how to provide practical algorithms wi
th better guarantees.\n\nLink to Zoom\n\nhttps://technion.zoom.us/j/91565759090
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:89@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210622T123000
DTEND;TZID=Asia/Jerusalem:20210622T133000
DTSTAMP:20210615T094828Z
URL:https://dds.technion.ac.il/iemevents/domain-adaptation-with-category-s
hift-an-application-to-aspect-extraction/
SUMMARY:Domain Adaptation with Category Shift\, an Application to Aspect Extraction
DESCRIPTION:By: M.Sc. Tony Lek\n Advisors: Prof. Roi Reichart and Dr. Yftah
Ziser\n Where: ZOOM From:\nTechnion\nAbstract:\n\nThe rise of pre-trained
language models has yielded substantial progress in the vast majority of
Natural Language Processing (NLP) tasks. However\, a generic approach towa
rds the pre-training procedure can naturally be sub-optimal in some cases.
Particularly\, fine-tuning a pre-trained language model on a source domai
n and then applying it to a different target domain\, results in a sharp
performance decline of the eventual classifier for many source-target dom
ain pairs. Moreover\, in some NLP tasks\, the output categories substantia
lly differ between domains\, making adaptation even more challenging. This
\, for example\, happens in the task of aspect extraction\, where the aspe
cts of interest of reviews of\, e.g.\, restaurants or electronic devices m
ay be very different. In this work we present a new fine-tuning scheme f
or BERT\, which aims to address the above challenges. We name this scheme
DILBERT: Domain Invariant Learning with BERT\, and customize it for aspe
ct extraction in the unsupervised domain adaptation setting. DILBERT harne
sses the categorical information of both the source and the target domains
to guide the pre-training process towards a more domain and category inva
riant representation\, thus closing the gap between the domains. We show t
hat DILBERT yields substantial improvements over state-of-the-art baseline
s while using a fraction of the unlabeled data\, particularly in more chal
lenging domain adaptation setups.\n\nZoom Link\n\nhttps://techn
ion.zoom.us/j/92960384508
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:91@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210623T103000
DTEND;TZID=Asia/Jerusalem:20210623T113000
DTSTAMP:20210620T102248Z
URL:https://dds.technion.ac.il/iemevents/a-magical-mlmc-estimator-with-app
lications-to-stochastic-optimization/
SUMMARY:A magical MLMC estimator\, with applications to stochastic optimization
DESCRIPTION:By: Yair Carmon \n Advisors: \n Where: ZOOM From:\nTel Aviv Un
iversity\nAbstract:\nMultilevel Monte Carlo (MLMC) is an estimation techni
que capable of producing cheap unbiased estimators - even when it appears
none should exist. In this talk\, we show how an MLMC estimator due to Bla
nchet and Glynn provides new primitives for stochastic optimization. In pa
rticular\, we combine it with standard gradient methods and obtain a nearl
y unbiased estimator for the minimizer of any strongly-convex optimization
problem\, whose variance and computational cost are nearly constant. By i
ncorporating our estimator into accelerated proximal point methods\, we re
solve the optimal complexity for minimizing the maximum of N convex functi
ons\, and also recover the best known complexity for projection-efficient
stochastic optimization (up to logarithmic factors). Finally\, we develop
another MLMC-based estimator for the gradient of distributionally robust o
ptimization objectives (including conditional value at risk)\, and use it
to obtain optimal rates of convergence.\n\nBased on works with Hilal Asi\,
John Duchi\, Arun Jambulapati\, Yujia Jin\, Daniel Levy and Aaron Sidford
.\n\nZoom Link\n\nhttps://technion.zoom.us/j/91227451164
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:93@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210630T123000
DTEND;TZID=Asia/Jerusalem:20210630T133000
DTSTAMP:20210623T043623Z
URL:https://dds.technion.ac.il/iemevents/better-environments-for-better-ai
/
SUMMARY:Better Environments for Better AI
DESCRIPTION:By: Sarah Keren\n Advisors: \n Where: ZOOM From:\nHarvard Univ
ersity and The Hebrew University\nAbstract: Most AI research focuses exclu
sively on the AI agent itself\, i.e.\, given some input\, what are the imp
rovements to the agent’s reasoning that will yield the best possible out
put? In my research\, I take a novel approach to increasing the capabiliti
es of AI agents via the use of AI to design the environments in which they
are intended to act. My methods identify the inherent capabilities and li
mitations of AI agents and find the best way to modify their environment i
n order to maximize performance.\n\nI will describe research pr
ojects that vary in their design objective\, in the AI methodologies that
are applied for finding optimal designs\, and in the real-world applicatio
ns to which they correspond. One example is Goal Recognition Design (GRD)\
, which seeks to modify environments to allow an observing agent to infer
the goals of acting agents as soon as possible. A second is Helpful Info
rmation Shaping (HIS)\, which seeks to find the minimal information to rev
eal to a partially-informed robot in order to guarantee the robot’s goal
can be achieved. I will also show how HIS can be used in a market of info
rmation\, where robots can trade their knowledge about the environment and
achieve effective communication that allows them to jointly maximize t
heir performance. The third\, Design for Collaboration (DFC)\, considers a
n environment with multiple self-interested reinforcement learning agents
and seeks ways to encourage them to collaborate effectively. Throughout th
e talk\, I will discuss how the different frameworks fit within my overarc
hing objective of using AI to promote effective multi-agent collaboration
and to enhance the way robots and machines interact with humans.\n\nBio: S
arah Keren is a postdoctoral fellow at Harvard University and The Hebrew U
niversity of Jerusalem. This fall\, she will be joining the Computer Scien
ce Faculty at the Technion\, Israel Institute of Technology as a senior le
cturer (assistant professor). Sarah received her PhD from the Technion. He
r research focuses on providing theoretical foundations for AI systems tha
t are capable of effective collaboration with each other and with people.
She has received a number of awards\, including the ICAPS 2020 Best Disser
tation Honorable Mention\, the ICAPS 2014 Honorable Mention for Best Paper
\, the Eric and Wendy Schmidt Postdoctoral Award for Women in Mathematical
and Computing Sciences\, and the Weizmann Institute of Science National P
ostdoctoral Award for Advancing Women in Science.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94589142334
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:95@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210706T123000
DTEND;TZID=Asia/Jerusalem:20210706T133000
DTSTAMP:20210630T095448Z
URL:https://dds.technion.ac.il/iemevents/variational-data-augmentation-wit
h-deep-learning-generative-models-for-imbalanced-learning/
SUMMARY:Variational Data Augmentation with Deep Learning Generative Models for Imbalanced Learning
DESCRIPTION:By: M.Sc Yanay Manheim\n Advisors: Prof. Tamir Hazan\n Where: Z
OOM From:\nTechnion\nAbstract: Deep Learning models are known for their ab
ility to successfully solve Machine Learning problems involving large amou
nts of data with evenly distributed classes. In contrast\, imbalanced lear
ning problems\, where some classes are underrepresented\, remain a challe
nge and often require creative solutions. This work will focus on balanc
ing the class distribution by generating
examples from the underrepresented class\, known as oversampling\, by usi
ng deep generative models based on Generative Adversarial Networks (GANs)
and Variational Auto Encoders (VAEs). Deep generative models are designed
to work with high dimensional (possibly structured) data\, as opposed to o
ther methods such as Synthetic Minority Over-sampling Technique (SMOTE) an
d Adaptive Synthetic Sampling Approach (ADASYN). This thesis proposes a di
rect variational generative discriminative network that is able to gener
ate instances conditioned on additional data\, such as class label. Our me
thod models an instance as a composition of discrete and continuous latent
features in a probabilistic model. The discrete and continuous variables
are sampled using Gumbel-max and standard Gaussian reparameterization resp
ectively. The resulting loss objective still includes an argmax operation\
, which can be estimated with softmax based relaxations. Instead\, the dis
crete loss is directly optimized by applying the direct optimization throu
gh argmax for discrete VAE approach. We use the encoding of data into a di
screte component as a classifier\, which means that when the training is d
one\, we are left with both a generator and a classifier. By augmenting th
e data\, the classifier is able to learn from a larger\, balanced and dive
rsified training set. While training\, synthetic data is generated and add
ed to every mini batch training set in order to improve classification per
formance and generalization\, which is measured by several evaluation metr
ics. Our proposed method's effectiveness is demonstrated on two datasets
 and compared to the latest state-of-the-art deep-learning-based generative-d
iscriminative algorithms.\n\nZoom Link\n\nhttps://technion.zoom.us/j/97933
825886
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:107@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210714T123000
DTEND;TZID=Asia/Jerusalem:20210714T133000
DTSTAMP:20210708T051451Z
URL:https://dds.technion.ac.il/iemevents/measuring-the-robustness-of-vqa-s
ystems/
SUMMARY:Measuring the Robustness of VQA Systems
DESCRIPTION:By: MSc. Daniel Rosenberg\n Advisors: Professor Roi Reichart\n
Where: ZOOM From:\nTechnion\nAbstract:\n\nDeep learning algorithms have sh
own promising results in visual question answering (VQA) tasks\, but a mor
e careful look reveals that they often do not understand the rich signal t
hey are being fed. To understand and better measure the generalizatio
n capabilities of VQA systems\, we look at their robustness to counterfact
ually augmented data. Our proposed augmentations are designed to make a fo
cused intervention on a specific property of the question such that the an
swer changes. Using these augmentations\, we propose a new robustness meas
ure\, Robustness to Augmented Data (RAD)\, which measures the consistency
of model predictions between original and augmented examples. Through exte
nsive experimentation\, we show that RAD\, unlike classical accuracy measu
res\, can quantify when state-of-the-art systems are not robust to counter
factuals. We find substantial failure cases which reveal that current VQA
systems are still brittle. Finally\, we connect robustness and gen
eralization\, demonstrating the predictive power of RAD for performance on
unseen augmentations.\n\nZoom Link\n\nhttps://technion.zoom.us/j/98617336
445
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:113@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20210914T133000
DTEND;TZID=Asia/Jerusalem:20210914T143000
DTSTAMP:20210914T051946Z
URL:https://dds.technion.ac.il/iemevents/model-compression-with-distillati
on-and-pruning-in-natural-language-processing/
SUMMARY:Model Compression with Distillation and Pruning in Natural Language Processing
DESCRIPTION:By: MSc. Chen Badler\n Advisors: Professor Roi Reichart\n Where
: ZOOM From:\nTechnion\nAbstract:\nIn natural language processing\, neura
l networks are becoming increasingly deep and complex. Recent state-of-th
e-art models are deep language representation models such as BERT\, ELMo
\, and GPT. These developments have led to the conviction that previous-g
eneration\, shallower neural networks for language understanding are obso
lete. However\, in this work\, we demonstrate how to use these large mode
ls to distill knowledge into a shallow student neural network\, which can
 then be pruned aggressively to create a lightweight neural network 90% 
smaller than the original student network\, while maintaining its origina
l performance. We propose to distill knowledge from BERT\, a state-of-the
-art language representation model\, into a single-layer LSTM\, and to ap
ply pruning to this network as well\, resulting in a sparse\, fast networ
k that distills BERT's knowledge and outperforms a regularly trained LST
M. We 
demonstrate our method across multiple datasets and tasks in natural langu
age processing.\n\nZoom Link\n\nhttps://technion.zoom.us/j/96121153839
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:115@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211006T123000
DTEND;TZID=Asia/Jerusalem:20211006T133000
DTSTAMP:20210929T045943Z
URL:https://dds.technion.ac.il/iemevents/dual-decomposition-of-convex-opti
mization-layers-for-consistent-attention-in-medical-images/
SUMMARY:Dual Decomposition of Convex Optimization Layers for Consistent Attention in Medical Images
DESCRIPTION:By: MSc. Tom Ron\n Advisors: Professor Tamir Hazan and Dr. Mich
al Weiler-Sagie\n Where: ZOOM From:\nTechnion\nAbstract:\n\nA key concern
in integrating machine learning models in medicine is the ability to inter
pret their reasoning. Popular explainability methods have demonstrated sat
isfactory results in natural image recognition\, yet in medical image anal
ysis\, many of these approaches provide partial and noisy explanations\, w
hich raises reliability concerns. Recently\, attention mechanisms have sh
own compelling results both in their predictive performance and in their i
nterpretable qualities. A fundamental trait of attention is that it levera
ges salient parts of the input which contribute to the model's prediction.
To this end\, our work focuses on the explanatory value of attention weig
ht distributions. We propose a multi-layer attention mechanism for refined
explanations that enforces consistent interpretations between attended co
nvolutional layers using convex optimization. We apply duality to decompos
e the consistency constraints between the layers by reparameterizing their
attention probability distributions. We further suggest learning the dual
 witness by optimizing with respect to our objective\; thus\, our implemen
tation uses standard back-propagation and is hence highly efficient. While pres
erving predictive performance\, our proposed method leverages weakly annot
ated medical imaging data and provides complete and faithful explanations
to the model's prediction.\n\nZoom link\n\nhttps://technion.zoo
m.us/j/96121153839
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:118@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211027T143000
DTEND;TZID=Asia/Jerusalem:20211027T153000
DTSTAMP:20211018T080718Z
URL:https://dds.technion.ac.il/iemevents/covid-19-during-2021-in-israel-wa
ning-immunity-and-the-booster-dose-effect/
SUMMARY:COVID-19 during 2021 in Israel: waning immunity and the booster dose effect
DESCRIPTION:By: Prof. Yair Goldberg\n Advisors: \n Where: Bloomfield 424 F
rom:\nTechnion\nAbstract:\nIn December 2020\, Israel began a mass vaccinat
ion campaign against COVID-19\, administering the Pfizer BNT162b2 vaccine\,
which led to a sharp curtailing of the outbreak. After a period with almos
t no SARS-CoV-2 infections\, a resurgent COVID-19 outbreak began in mid-Ju
ne 2021. On July 30\, 2021\, a third (booster) dose of the vaccine was app
roved in Israel for individuals 60 years or older who had received the sec
ond dose at least five months previously. The booster vaccination campaign
in Israel was gradually expanded to younger age groups who received a sec
ond dose five or more months earlier. Using Israel's national database\, w
e quantify the extent of waning immunity of two doses against the Delta va
riant\, and estimate the difference in the rate of confirmed infection and
severe COVID-19 between recipients of only two doses of vaccine and those
also receiving a third booster dose. In this talk\, I present the data\,
the methods used to analyze them\, and the results that indicate the effec
tiveness of the booster dose for all age groups.\nThe work is joint with M
icha Mandel\, Yinon Bar-On\, Omri Bodenheimer\, Laurence Freedman\, Nir Ka
lkstein\, Barak Mizrahi\, Sharon Alroy-Preis\, Nachman Ash\, Amit Huppert\
, and Ron Milo.
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:119@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211103T143000
DTEND;TZID=Asia/Jerusalem:20211103T153000
DTSTAMP:20211103T070530Z
URL:https://dds.technion.ac.il/iemevents/incentive-design-for-online-colla
borative-settings/
SUMMARY:Incentive Design for Online Collaborative Settings
DESCRIPTION:By: Prof. Kobi Gal \n Advisors: \n Where: Bloomfield 424 From
:\nBen-Gurion University \nAbstract:\n\nAdvances in technologies and inter
face design are enabling group activities of varying complexities to be ca
rried out\, in whole or in part\, over the internet\, with benefits to sci
ence and society (e.g.\, citizen science\, Massive Online Open Courses (
MOOC) and questions-and-answers sites). The need to support human interact
ion in such settings brings new and significant challenges to AI: how to 
provide incentives that keep participants motivated and productive\; how t
o provide useful information to system designers to help them decide whe
ther and how to intervene with the group's work\; how to scientifically ev
aluate the effects of AI interventions on the performance of individuals a
nd the group. I will describe ongoing projects in my lab that address thes
e challenges in three socially relevant settings (education\, Q&A sites\,
 and citizen science) and discuss potential ethical issues that arise fro
m using AI for behavior change in the real world.
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:120@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211110T143000
DTEND;TZID=Asia/Jerusalem:20211110T153000
DTSTAMP:20211028T090947Z
URL:https://dds.technion.ac.il/iemevents/the-role-of-childrens-vaccination
-for-covid-19-pareto-optimal-allocations-of-vaccines/
SUMMARY:The role of children’s vaccination for COVID-19 - Pareto-optimal
allocations of vaccines [ \n Computational Data Science (CDS) Seminar\
n Seminars\n \n ]
DESCRIPTION:By: Associate Professor Nir Gavish\n Advisors: \n Where: Bloom
field 424 and ZOOM From:\nTechnion\nAbstract:\n\nThe ultimate goal of COVI
D-19 vaccination campaigns is to enable the return of societies and econom
ies to a state of normality. While vaccines have been approved for childre
n of age 12 and older\, there is an ongoing debate as to whether children
should be vaccinated and at what priority. Different countries such as t
he USA and UK have adopted diverse policies on this issue\, while others r
emain undecided. In this work\, we use mathematical modeling and optimizat
ion to study the effect of vaccinating children on the epidemic spread. We
consider Pareto-optimal allocations according to competing measures of nu
mber of infections and mortality\, and systematically study the trade-offs
among them. When some weight is given to the number of infections\, we fi
nd that it is optimal to allocate vaccines to adolescents in age group 10-
19\, even when they are assumed to be less susceptible than adults. Additi
onally\, we find that in a broad range of scenarios\, optimal allocations
of vaccines do not include vaccination of age group 0-9.\n\nJoint work wit
h Guy Katriel\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:121@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211117T143000
DTEND;TZID=Asia/Jerusalem:20211117T153000
DTSTAMP:20211028T091558Z
URL:https://dds.technion.ac.il/iemevents/research-in-the-data-rich-biomedi
cal-era-to-advance-precision-medicine/
SUMMARY:Research in the data-rich biomedical era to advance precision medic
ine [ \n Computational Data Science (CDS) Seminar\n Seminars\n \
n ]
DESCRIPTION:By: Assistant Professor Dvir Aran\n Advisors: \n Where: Bloomf
ield 424 and ZOOM From:\nTechnion\nAbstract:\n\nThe boom in new technologies has transformed biomedical science into a digitized\, data-intensive
discipline. Computational researchers are tackling the challenge of integ
rating massive multidimensional biomedical data — including genomics and
clinical data — to advance precision medicine and improve therapeutic s
trategies. The talk will focus on recent studies I led to obtain new insig
hts from biomedical data\, including using diverse clinical data to predic
t labor and delivery complications\, the efficacy of treatments for cancer
patients\, and the heterogeneous physiological response to SARS-CoV-2 infe
ction. In addition\, I will present xCell and SingleR\, computational meth
ods I developed to allow for a better understanding of cellular heterogene
ity in complex tissues using bulk and single-cell RNA-seq\, and their appl
ications for promoting precision medicine.\n\nZoom Link\n\nhttps://technio
n.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:122@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211124T123000
DTEND;TZID=Asia/Jerusalem:20211124T133000
DTSTAMP:20211028T091703Z
URL:https://dds.technion.ac.il/iemevents/challenging-and-adapting-nlp-mode
ls-to-lexical-phenomena/
SUMMARY:Challenging and Adapting NLP Models to Lexical Phenomena [ \n C
omputational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Dr. Yuval Pinter\n Advisors: \n Where: Bloomfield 424 and
ZOOM From:\nBen-Gurion University \nAbstract:\n\nOver the last few years\,
deep neural models have taken over the field of natural language processi
ng (NLP)\, brandishing great improvements on many of its sequence-level ta
sks. But the end-to-end nature of these models makes it hard to figure out
whether the way they represent individual words aligns with how language
builds itself from the bottom up\, or how lexical changes in register and
domain can affect the untested aspects of such representations.\n\nIn this talk\, I will present NYTWIT\, a dataset created to challenge
large language models at the lexical level\, tasking them with identificat
ion of processes leading to the formation of novel English words\, as well
as with segmentation and recovery of the specific subclass of novel blend
s. I will then present XRayEmb\, a method which alleviates the hardships o
f processing these novelties by fitting a character-level encoder to the e
xisting models' subword tokenizers\; and conclude with a discussion of the
drawbacks of current tokenizers' vocabulary creation schemes.\n\nZoom Lin
k\n\nhttps://technion.zoom.us/j/94950420992\n\n
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:141@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211208T163000
DTEND;TZID=Asia/Jerusalem:20211208T173000
DTSTAMP:20211206T055938Z
URL:https://dds.technion.ac.il/iemevents/meta-level-techniques-for-plannin
g-search-and-scheduling/
SUMMARY:Meta-level techniques for planning\, search\, and scheduling [ \n
Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Dr. Shahaf S. Shperberg \n Advisors: \n Where: ZOOM From:\
nUniversity of Texas \nAbstract:\n\nMetareasoning is a core idea in AI tha
t captures the essence of being both human-like and intelligent. The idea
is that much can be gained by thinking (reasoning) about one's own thinkin
g. In the context of search and planning\, metareasoning involves making e
xplicit decisions about computation steps by comparing their 'cost' in computational resources against the gain they can be expected to make towards advancing the search for a solution (or plan)\, thus enabling better decisions. To apply metareasoning\, a meta-level problem needs to be defined and solved with respect to a specific framework or algorithm. In some cases\, these meta-level problems can also be very hard to solve (sometimes even harder than the original search problem). Yet\, even a fast-to-compute approximation of meta-level problem solutions can yield good results and improve the algorithms to which it is applied.\n\nThis talk focuses
on the development and evaluation of different metareasoning techniques\,
tailored for different problem settings\, designed to improve a variety o
f search\, planning and scheduling algorithms.\n\nZoom Link
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:136@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211215T163000
DTEND;TZID=Asia/Jerusalem:20211215T173000
DTSTAMP:20211123T071306Z
URL:https://dds.technion.ac.il/iemevents/a-statistical-analysis-of-automat
ic-evaluation-metrics-for-summarization/
SUMMARY:A Statistical Analysis of Automatic Evaluation Metrics for Summariz
ation [ \n Computational Data Science (CDS) Seminar\n Seminars\n
\n ]
DESCRIPTION:By: Dr. Rotem Dror\n Advisors: \n Where: ZOOM From:\nUniversit
y of Pennsylvania\nAbstract:\n\nThe quality of a summarization evaluation
metric is quantified by calculating the correlation between its scores and
human annotations across a large number of summaries. Currently\, it is n
ot clear how precise these correlation estimates are\, nor whether differences between two metrics’ correlations reflect a true difference or are due to random chance. In this talk\, I will address these two proble
ms by proposing methods for calculating confidence intervals and running h
ypothesis tests for correlations. After evaluating which of the proposed m
ethods is most appropriate for summarization through two simulation experi
ments\, I will analyze the results of applying these methods to several di
fferent automatic evaluation metrics across three sets of human annotation
s. In this research\, we find that the confidence intervals are rather wid
e\, demonstrating high uncertainty in how reliable automatic metrics truly
are. Further\, although many metrics fail to show statistical improvement
s over ROUGE\, two recent works\, QAEval and BERTScore\, do in some evalua
tion settings. This work was accepted to TACL 2021: https://arxiv.org/abs/2
104.00054\n\nI will also present an ongoing study that identifies two ways
in which the definition of the system-level correlation is inconsistent w
ith how metrics are used to evaluate summarization systems in practice\, and proposes changes to rectify this disconnect. The results from these analys
es point to the need for future research to focus on developing more consi
stent and reliable human evaluations of summaries.\n\nThis work was done i
n collaboration with Daniel Deutsch\, a PhD student from the Cognitive Com
putation Group at the Department of Computer and Information Science\, Uni
versity of Pennsylvania.\n\nZoom Link\n\nhttps://technion.zoom.us/j/949504
20992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:123@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20211222T143000
DTEND;TZID=Asia/Jerusalem:20211222T153000
DTSTAMP:20211219T053311Z
URL:https://dds.technion.ac.il/iemevents/integrative-structure-modeling-in
-the-age-of-deep-learning/
SUMMARY:Integrative Structure Modeling in the Age of Deep Learning [ \n
Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Assistant Professor Dina Schneidman\n Advisors: \n Where:
ZOOM From:\nHebrew University \nAbstract:\n\nIntegrative structure modelin
g is often used to characterize structures and dynamics of large macromole
cular assemblies by relying on multiple types of input information. The in
dividual proteins or domains are represented by atomic resolution structur
es or low-resolution sphere models\, and data from a variety of sources\, such as cross-linking mass spectrometry\, cryo-electron microscopy\, and small-angle X-ray scattering\, are used to assemble the subunits. Recent progress in protein folding\, enabled by deep learning in AlphaFold2 and RosettaFold\, has provided improved structural coverage for domains\, and even for protein-protein interactions used in integrative structure modeling. However\, these methods depend on multiple sequence alignments\, which are not available for immune-response complexes\, such as antibody-antigen interactions. Recently
\, we began utilizing deep learning approaches for a range of integrative
modeling tasks\, including development of scoring functions for prediction
of protein-protein or protein-peptide interactions\, modeling and docking
of antibodies and nanobodies to the antigens\, binding sites identificati
on\, and learning scoring functions for experimental data. I will describe
the progress and current challenges in deep learning applications for mod
eling of complexes.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:152@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220104T143000
DTEND;TZID=Asia/Jerusalem:20220104T153000
DTSTAMP:20220104T101838Z
URL:https://dds.technion.ac.il/iemevents/estimating-nlp-domain-adaptation-
performance-using-model-causal-analysis/
SUMMARY:Estimating NLP Domain Adaptation Performance Using Model Causal Ana
lysis [ \n Computational Data Science (CDS) Seminar\n Seminars\n
\n ]
DESCRIPTION:By: M.Sc. Boaz Ben-Dov\n Advisors: Prof. R. Reichart\n Where: Z
OOM From:\nTechnion\nAbstract:\n\nDomain adaptation setups were not all born equal\, and some domains are easier to adapt to and from than others. This talk will present an attempt to estimate the difficulty (or ease) of adapting between different domains\, based on the causal effect of certain features in the data on the adapting model’s predictions. This question is relevant in many real-life scenarios where computational resources exist in relative abundance\, while labeling and data-gathering are time-consuming\, expensive\, or otherwise problematic. We will discuss approaches to inspecting NLP models\, leveraging existing labeled and unlabeled data which might not seem immediately relevant\, and trying to reason about the factors which affect NLP model performance.\n\nZoom Link\n\nhttps://technion.zoom.us/j/97209155707
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:137@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220105T143000
DTEND;TZID=Asia/Jerusalem:20220105T153000
DTSTAMP:20211123T071726Z
URL:https://dds.technion.ac.il/iemevents/towards-democratizing-natural-lan
guage-processing/
SUMMARY:Towards Democratizing Natural Language Processing [ \n Computat
ional Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Dr. Omer Levy\n Advisors: \n Where: ZOOM From:\nTel Aviv U
niversity\nAbstract:\n\nThe methodology of pretraining large language mode
ls and then fine-tuning them on annotated examples has greatly advanced NL
P. However\, these advances come at rising computational costs in both tra
ining and inference time\, data collection and annotation challenges\, and
engineering expertise - all of which are necessary to apply state-of-the-
art NLP models to real tasks. In this talk\, we discuss recent steps towar
ds democratizing NLP technology\, and making it applicable to low-resource
settings. We cover new methods for efficient pretraining\, trade-offs bet
ween annotating more data and scaling up models\, and how to simulate targ
et-task data from unlabeled text. Finally\, we discuss a new paradigm that
aims to allow lay-users to define their own tasks on the fly\, in natural
language\, without the intervention of engineers or data scientists.\n\nZ
oom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:154@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220112T143000
DTEND;TZID=Asia/Jerusalem:20220112T153000
DTSTAMP:20220102T083137Z
URL:https://dds.technion.ac.il/iemevents/two-tales-of-retrieval/
SUMMARY:Two tales of retrieval [ \n Computational Data Science (CDS) Se
minar\n Seminars\n \n ]
DESCRIPTION:By: Professor Jonathan Berant\n Advisors: \n Where: Bloomfie
ld 424 and ZOOM From:\nTel Aviv University \nAbstract:\n\nThe field of NLP
has undergone a revolution in the past few years\, driven by the use of ve
ry large language models (LMs) that can learn to perform language understa
nding tasks given only a few examples. In this talk\, I will describe two
recent projects that use neural retrievers to address some shortcomings o
f large LMs. First\, while large LMs do well given a short piece of text\,
it is difficult to capitalize on their advantages for tasks that are at t
he corpus level (say all of Wikipedia). I will describe a self-supervised
method for retrieving paragraphs from a large corpus\, such as Wikipedia\,
that performs well even without any training examples. Second\, the behav
ior of very large LMs strongly depends on the examples that are given to t
hem as input (this is often termed in-context learning). We present a meth
od for retrieving examples from the training set that maximizes the perfor
mance of the LM on a downstream language understanding task. Our approache
s make it easier to build new question answering models for arbitrary corp
ora and to interact with large LMs that are provided as a service by comme
rcial companies.\n\nZoom Link\n\nhttps://technion.zoom.us/j/949
50420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:149@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220119T143000
DTEND;TZID=Asia/Jerusalem:20220119T153000
DTSTAMP:20211219T053908Z
URL:https://dds.technion.ac.il/iemevents/privacy-as-stability-for-generali
zation/
SUMMARY:Privacy as Stability\, for Generalization [ \n   Computational
Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Associate Professor Katrina Ligett\n Advisors: \n Where:
Zoom Link From:\nThe Hebrew University\nAbstract:\nMany data analysis pipe
lines are adaptive: the choice of which analysis to run next depends on th
e outcome of previous analyses. Common examples include variable selection
for regression problems and hyper-parameter optimization in large-scale m
achine learning problems: in both cases\, common practice involves repeate
dly evaluating a series of models on the same dataset. Unfortunately\, thi
s kind of adaptive re-use of data invalidates many traditional methods of
avoiding overfitting and false discovery\, and has been blamed in part for
the recent flood of non-reproducible findings in the empirical sciences.
An exciting line of work beginning with Dwork et al. in 2015 establishes t
he first formal model and first algorithmic results providing a general ap
proach to mitigating the harms of adaptivity\, via a connection to the not
ion of differential privacy.\nIn this talk\, we'll explore the notion of d
ifferential privacy and gain some understanding of how and why it provides
protection against adaptivity-driven overfitting. Many interesting questi
ons in this space remain open.\nJoint work with: Christopher Jung (UPenn)\
, Seth Neel (Harvard)\, Aaron Roth (UPenn)\, Saeed Sharifi-Malvajerdi (UPe
nn)\, and Moshe Shenfeld (HUJI).\n\nZoom Link\n\nhttps://technion.zoom.us/
j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:173@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220306T110000
DTEND;TZID=Asia/Jerusalem:20220306T120000
DTSTAMP:20220302T113430Z
URL:https://dds.technion.ac.il/iemevents/smooth-contextual-bandits/
SUMMARY:Smooth Contextual Bandits [ \n Computational Data Science (CDS)
Seminar\n Seminars\n \n ]
DESCRIPTION:By: Prof. Nathan Kallus\n Advisors: \n Where: Bloomfield 527 and ZOOM From:\nCornell University\nAbstract: Contextual bandit problems
model the inherent tradeoff between exploration and exploitation in perso
nalized decision making in marketing\, healthcare\, revenue management\, a
nd more. Specifically\, the tradeoff is characterized by the optimal growt
h rate of the regret in cumulative rewards compared to the optimal policy.
Naturally\, the optimal rate should depend on how complex the underlying
supervised learning problem is\, namely how much observing reward in one context can tell us about mean rewards in another context. Curiously\, this obvious-seeming relationship is obscured in current theory that separately studies the easy\, fully-extrapolatable case and the hard\, super-local case. To characterize the relationship more precisely\, I study a nonparametric
contextual bandit problem where expected reward functions are β-smooth (r
oughly meaning β-times differentiable). I will show how this interpolates
between the two extremes previously studied in isolation: non-differentia
ble-response bandits (β ≤ 1)\, where rate-optimal regret is achieved by
decomposing the problem into non-contextual bandits\, and parametric-resp
onse bandits (β = ∞)\, where rate-optimal regret is often achievable wi
thout any exploration at all. We develop a novel algorithm that works for
any given smoothness setting by operating neither fully locally nor fully
globally. We prove its regret is rate-optimal\, thereby characterizing the
optimal regret rate and revealing a fuller picture of the crucial interpl
ay between complexity and regret in dynamic decision making. Time permitti
ng\, I will also discuss how to construct valid confidence intervals from
data collected by contextual bandits\, a crucial challenge in the enterpri
se to replace randomized trials with adaptive experiments in applied field
s from biostatistics to development economics.\n\nThis talk is based on th
e following papers:\nSmooth Contextual Bandits: Bridging the Parametric an
d Non-differentiable Regret Regimes\nhttps://pubsonline.informs.org/doi/ab
s/10.1287/opre.2021.2237\n\nPost-Contextual-Bandit Inference\nhttps://pape
rs.nips.cc/paper/2021/hash/eff3058117fd4cf4d4c3af12e273a40f-Abstract.html\
n\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:176@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220320T123000
DTEND;TZID=Asia/Jerusalem:20220320T133000
DTSTAMP:20220315T071815Z
URL:https://dds.technion.ac.il/iemevents/dynamic-length-factorization-mach
ines-for-ctr-prediction/
SUMMARY:Dynamic Length Factorization Machines for CTR Prediction [ \n C
omputational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Yohay Kaplan\n Advisors: \n Where: Bloomfield 527 and ZOOM
From:\nYahoo\nAbstract:\n\nAd click-through rate prediction (pCTR) is one of the core tasks of online advertising. Driving the pCTR models of Yahoo native advertising is OFFSET\, a feature-enhanced collaborative-filtering-based event prediction algorithm. Due to data sparsity issues\, OFFSET models both users and items by mapping their features into a latent space\, where the resulting user vector is a nonlinear function of the user feature vectors (e.g.\, age\, gender\, hour\, etc.)\, which allows pairwise dependencies. This pairwise-dependency concept is also used by other algorithms\, such as Field-aware Factorization Machines (FFM). However\, both in OFFSET and in FFM\, the different pairwise interactions are modeled by latent vectors of constant and equal lengths.\nIn this work we present a Dynam
ic Length Factorization Machines (DLFM) algorithm that dynamically optimiz
es the length of the vectors for each feature interaction during training\
, while not exceeding a maximal overall latent vector size.\nAn online eva
luation of this approach via bucket testing serving Yahoo native traffic s
howed a 1.46% revenue lift and a 2.15% CTR lift. Since being integrated into production\, the DLFM has not only improved the accuracy of the model by optimizing the length of each latent space\, but has also reduced the total size of the model by 25%. Although we applied the algorithm to OFFSET\, we show that DLFM can be applied to any FFM-like algorithm to optimize its pairwise feature vector lengths.\nWe also present an Educated Model Initialization\, a novel mechanism for initializing a new model based on an existing model that has some mutual user features. Using this mechanism\, we managed to reduce the training time of our models by more than 90% when compared to an equivalent model trained from "scratch".\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:179@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220327T123000
DTEND;TZID=Asia/Jerusalem:20220327T133000
DTSTAMP:20220322T064607Z
URL:https://dds.technion.ac.il/iemevents/implicit-bias-in-machine-learning
/
SUMMARY:Implicit bias in machine learning [ \n Computational Data Scien
ce (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Ohad Shamir\n Advisors: \n Where: ZOOM From:\nWeizmann Ins
titute\nAbstract: Most practical algorithms for supervised machine learnin
g boil down to optimizing the average performance over a training dataset.
However\, it is increasingly recognized that although the optimization ob
jective is the same\, the manner in which it is optimized plays a decisive
role in the properties of the resulting predictor. For example\, when tra
ining large neural networks\, there are generally many weight combinations
that will perfectly fit the training data. However\, gradient-based train
ing methods somehow tend to reach those which\, on the one hand\, do not o
verfit\, and on the other hand\, are very brittle to adversarially crafted
examples. Why the dynamics of these methods lead to such "implicit biases
" is still far from being fully understood. In this talk\, I'll describe s
everal recent theoretical results related to this question\, in the contex
t of benign overfitting and adversarial examples.\n\nBased on joint work w
ith Gal Vardi and Gilad Yehudai.\n\nZoom Link\n\nhttps://technion.zoom.us/
j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:182@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220403T123000
DTEND;TZID=Asia/Jerusalem:20220403T133000
DTSTAMP:20220322T094821Z
URL:https://dds.technion.ac.il/iemevents/generalizing-to-new-distributions
-via-invariance-challenges-and-opportunities/
SUMMARY:Generalizing to New Distributions via Invariance: Challenges and Op
portunities [ \n Computational Data Science (CDS) Seminar\n Semina
rs\n \n ]
DESCRIPTION:By: Elan Rosenfeld\n Advisors: \n Where: Bloomfield 527 and ZO
OM From:\nCarnegie Mellon University\nAbstract: \nA central question of contemporar
y machine learning is how to ensure that a predictor will succeed on test
data which differs substantially from its training data. There is a growin
g body of work which attempts to solve this problem using invariance: thes
e methods hope to identify features whose informativeness of the target va
riable is stable across multiple training datasets and will remain useful
under major distribution shift. While this is a promising approach\, there
is little formal analysis of these methods\, making it difficult to under
stand if or when they can be expected to succeed. In this talk\, I will pr
esent a new probabilistic model and accompanying analysis which enables a
deeper understanding of how these algorithms perform when applied to moder
n high-dimensional data. First\, I will show that many proposed methods fo
r invariant deep learning—including the well-known Invariant Risk Minimi
zation—require an unreasonable number of training datasets to achieve th
eir stated goals and can rarely\, if ever\, be expected to outperform stan
dard Empirical Risk Minimization. I will then propose new algorithms inspi
red by this model which enjoy both theoretical and empirical benefits in t
erms of the number of datasets required to succeed—and additionally shed
light on promising future directions for successful out-of-distribution g
eneralization.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:189@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220410T123000
DTEND;TZID=Asia/Jerusalem:20220410T133000
DTSTAMP:20220329T073649Z
URL:https://dds.technion.ac.il/iemevents/predicting-decisions-in-language-
based-persuasion-games/
SUMMARY:Predicting Decisions in Language Based Persuasion Games [ \n Co
mputational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Reut Apel\n Advisors: \n Where: Bloomfield 527 and ZOOM Fr
om:\nTechnion\nAbstract: Sender-receiver interactions\, and specifically p
ersuasion games\, are widely researched in economic modeling and artificia
l intelligence\, and serve as a solid foundation for powerful applications
. However\, in the classic persuasion games setting\, the messages sent fr
om the expert to the decision-maker are abstract or well-structured applic
ation-specific signals rather than natural (human) language messages\, alt
hough natural language is a very common communication signal in real-world
persuasion setups. This talk addresses the use of natural language in per
suasion games\, exploring its impact on the decisions made by the players
and aiming to construct effective models for the prediction of these decis
ions.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:183@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220424T123000
DTEND;TZID=Asia/Jerusalem:20220424T133000
DTSTAMP:20220322T095401Z
URL:https://dds.technion.ac.il/iemevents/the-seeing-eye-robot-developing-a
-human-aware-artificial-collaborator/
SUMMARY:The Seeing-eye Robot - Developing a Human-aware Artificial Collabor
ator [ \n Computational Data Science (CDS) Seminar\n Seminars\n
\n ]
DESCRIPTION:By: Reuth Mirsky\n Advisors: \n Where: Bloomfield 527 and ZOOM
From:\nBar Ilan University \nAbstract:\n\nIn this talk I will present the
seeing-eye robot grand challenge and discuss the components required to d
esign and build a service robot that can replace or surpass the functional
ities of a seeing-eye dog. This challenge encompasses a variety of researc
h problems that can benefit from human-inspired AI: reasoning about other
agents\, human-robot interactions\, explainability\, teaching teammates\,
and more. For each of these problems\, I will present an example novel con
tribution that leverages the bilateral investigation of human and artifici
al intelligence. Finally\, I will discuss the many remaining challenges to
wards achieving a seeing-eye robot and how I plan to tackle these challeng
es.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:184@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220501T123000
DTEND;TZID=Asia/Jerusalem:20220501T133000
DTSTAMP:20220322T095735Z
URL:https://dds.technion.ac.il/iemevents/alexa-lets-shop-personalized-shop
ping-assistance-in-ecommerce/
SUMMARY:Alexa\, Let's Shop: Personalized Shopping Assistance in eCommerce [
\n Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: David Carmel\n Advisors: \n Where: Bloomfield 527 and ZOOM
From:\nAmazon\nAbstract: In this talk I'll discuss the fundamentals of pe
rsonalized shopping assistance in eCommerce using voice-based AI assistant
devices. I'll cover some of the recent work done in our lab on personaliz
ed shopping experience through Amazon's Alexa\, with a focus on product
search and product question answering.\n\nZoom Link\n\nhttps://technion.zo
om.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:188@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220508T123000
DTEND;TZID=Asia/Jerusalem:20220508T133000
DTSTAMP:20220328T071406Z
URL:https://dds.technion.ac.il/iemevents/negative-controls-for-instrumenta
l-variables-2/
SUMMARY:Negative controls for instrumental variables [ \n Computational
Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Daniel Nevo\n Advisors: \n Where: Bloomfield 527 and ZOOM
From:\nTel Aviv University \nAbstract:\n\nIn the inevitable presence of un
measured confounding\, instrumental variable (IV) estimation is a widely-u
sed method utilizing auxiliary variables (the IVs) to study causal effects
. Crucially\, the validity of IV methodology relies on exclusion restricti
on assumptions that are not directly testable and that are therefore chall
enging to assess empirically. In this talk\, we will formalize the logic o
f falsification tests based on negative controls\, a set of tests that lev
erage contextual knowledge about the absence of causal links (for example\
, from treatment to past outcomes) to test the validity of candidate instr
uments. We will establish that IV negative control tests can be mapped to
a class of prediction problems that can leverage current machine learning
methods. Furthermore\, these prediction algorithms are combined with a sta
tistical hypothesis testing framework to provide a clear decision rule. Th
ese methods will be particularly applicable to research using large datase
ts with many candidate variables that can be used as negative controls\, a
n increasingly common situation for which no formal methods currently exis
t.\nJoint work with Oren Danieli\, Dan Zeltzer (TAU Econ)\, Itai Walk\, an
d Bar Weinstein (TAU Stat)\n\nZoom Link\n\nhttps://technion.zoom.us/j/9495
0420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:190@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220515T123000
DTEND;TZID=Asia/Jerusalem:20220515T133000
DTSTAMP:20220329T074031Z
URL:https://dds.technion.ac.il/iemevents/addressing-solution-quality-in-da
ta-generated-optimization-models/
SUMMARY:Addressing Solution Quality in Data-generated Optimization Models [
\n Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Segev Wasserkrug and Orit Davidovich\n Advisors: \n Where:
Bloomfield 527 and ZOOM From:\nIBM \nAbstract: Mathematical optimization
models can improve decision making in a wide variety of real-world applica
tions. However\, in many cases\, it is difficult or impossible to model so
me of the constraints or objectives for such problems. Examples include co
mplex physical and chemical processes in industrial manufacturing plants\,
as well as quantities which are difficult to formally define such as food
palatability. One way to overcome this is to use the technique known as c
onstraint learning\, which utilizes historical data to learn the relevant
parts of the optimization model. However\, such constraint learning may re
sult in inaccuracies very different from those in traditional machine lear
ning settings\, which pose significant challenges in learning good optimiz
ation models.\nIn this work\, we present a formal approach for addressing t
his gap and improving the quality of the solutions produced by the learnt m
odels. Our approach consists of: a) a formal definition of the measure of q
uality of the generated model\; b) a Gaussian Process approach for estimati
ng this quality measure\; and c) methods to augment the generated optimizat
ion model with additional constraints so as to obtain high-quality (as defined
by our measure) optimization models. The talk will include detailed theor
etical analysis of our framework\, as well as empirical analysis.\n\nZoom
Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:187@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220522T123000
DTEND;TZID=Asia/Jerusalem:20220522T133000
DTSTAMP:20220328T071132Z
URL:https://dds.technion.ac.il/iemevents/over-and-under-estimation-of-vacc
ine-effectiveness/
SUMMARY:Over- and under-estimation of vaccine effectiveness [ \n Comput
ational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Hilla De-Leon\n Advisors: \n Where: Bloomfield 527 and ZOO
M From:\nTechnion\nAbstract:\n\nSARS-CoV-2 vaccines provide high protectio
n against infection to the vaccinated individual and indirect protection t
o their surroundings by blocking further transmission. Divergent results ha
ve been reported on the effectiveness of the SARS-CoV-2 vaccines. Here\, w
e argue that this divergence arises because the analyses did not consider i
ndirect protection. Using a novel heterogeneous infection model and real-
world data\, we demonstrate that heterogeneous vaccination rates among fam
ilies and communities\, both spatially and temporally\, and the study desi
gn that is used may significantly skew the vaccine effectiveness estimatio
ns. We show that estimations of a vaccine with 85% effectiveness will vary
between marked underestimation of ∼70% and overestimation of ∼95% dep
ending on the number of interactions between vaccinated and unvaccinated i
ndividuals.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:215@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220524T113000
DTEND;TZID=Asia/Jerusalem:20220524T123000
DTSTAMP:20220522T115306Z
URL:https://dds.technion.ac.il/iemevents/universality-in-random-neural-net
works/
SUMMARY:Universality in Random Neural Networks [ \n Computational Data
Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Boris Hanin \n Advisors: \n Where: Meyer 861 From:\nPrince
ton\nAbstract: A neural network f(x\;\\theta) is a family of functions o
f a variable x\, varying continuously in a parameter vector \\theta. An import
ant chapter in the theory of neural networks is the analysis of random neu
ral networks given by choosing \\theta at random. In this talk\, I will fo
cus on the simplest setting of fully connected networks for which the stru
cture of the function f(x\;\\theta) is described by two integer parameters
\, a width n and a depth L\, as well as a non-linear function \\sigma.\nWh
en \\sigma is the identity\, a random fully connected network is a product
of L iid random matrices of size n x n. For general \\sigma\, the correla
tion functions of f(x\;\\theta) are non-linear generalizations of linear st
atistics for such matrix products. After giving some intuition for random
neural networks\, I will state a range of new results regarding a novel ki
nd of universality for their correlation functions. This notion of univers
ality differs from the typical use of this term in random matrix theory. I
will also state several conjectures about scaling limits of random neural
networks as n\,L tend to infinity at a fixed ratio. These conjectures are
open even for random matrix products.
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:219@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220529T123000
DTEND;TZID=Asia/Jerusalem:20220529T133000
DTSTAMP:20220526T092500Z
URL:https://dds.technion.ac.il/iemevents/modeling-decentralized-group-coor
dination-at-large-scale/
SUMMARY:Modeling Decentralized Group Coordination at Large Scale [ \n C
omputational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Dr. Oren Tsur\n Advisors: \n Where: ZOOM From:\nBen Gurion
University \nAbstract:\n\nUnderstanding collective decision making at a l
arge-scale\, and elucidating how community organization and community dyna
mics shape collective behavior are at the heart of social science research
. Communities are multi-faceted\, complex\, and dynamic. In this talk I wil
l present two approaches for learning community representations: a gener
ic representation that could be used as an exploratory tool to find nuance
d similarities between communities\, and a task oriented representation. B
oth representations combine multiple types of signals - textual and contex
tual\, e.g.\, the (social) network structure and community dynamics. I wil
l show how this multi-faceted model can accurately predict large-scale col
lective decision-making in a distributed environment. We demonstrate the a
pplicability of our model through Reddit's r/place - a large-scale online
experiment in which millions of users\, self-organized in thousands of com
munities\, clashed and collaborated in an effort to realize their agenda.\
n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:251@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220613T123000
DTEND;TZID=Asia/Jerusalem:20220613T133000
DTSTAMP:20220809T114743Z
URL:https://dds.technion.ac.il/iemevents/the-effect-of-external-and-intern
al-communication-on-error-time-series/
SUMMARY:The effect of external and internal communication on error time ser
ies [ \n Computational Data Science (CDS) Seminar\n Seminars\n \
n ]
DESCRIPTION:By: M.Sc. Sharon Dringer\n Advisors: Prof. Eitan Naveh\n Where:
ZOOM From:\nTechnion\nAbstract:\n\nSometimes errors occur in organization
s\, particularly in medical organizations such as hospitals. High levels o
f communication among hospital staff significantly reduce the number of er
rors.\n\nThe purpose of this seminar is to explain how the level of communi
cation among hospital staff affects the relation
ship between errors made in the past and in the future. Using sensor techn
ology over a period of one year\, we were able to locate individuals\, nam
ely hospital staff\, in real-time. According to our definition\, communica
tors are two individuals who are present at the same place for a period of
time. The communication level is defined as the number of meetings of the
communicators.\n\nThe database that we used is based on data collected ov
er a span of one year by 1000 sensors. The sensors update every three seco
nds in 6 oncology infusion units in one ambulatory hospital. The results a
re based on the integration of the meetings of the communicators\, hosp
ital maps\, data set of patients’ scheduled appointments\, and data set
of errors.\n\nBy integrating the four data sets and applying regression mo
dels\, we were able to demonstrate that higher levels of communication dir
ectly decrease the probability of future errors.
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:199@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220619T123000
DTEND;TZID=Asia/Jerusalem:20220619T133000
DTSTAMP:20220424T063407Z
URL:https://dds.technion.ac.il/iemevents/auctions-between-regret-minimizin
g-agents/
SUMMARY:Auctions between Regret-Minimizing Agents [ \n Computational Da
ta Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Yoav Kolumbus\n Advisors: \n Where: Bloomfield 527 and ZOO
M From:\nThe Hebrew University \nAbstract: The talk will focus on the pape
r "Auctions between Regret-Minimizing Agents" (WWW '22\, see the abstract
below) but will also touch on broader topics in learning dynamics and on t
he strategic considerations of users of learning agents\, which extend bey
ond auction games.\n\nWe analyze a scenario in which software agents im
plemented as regret-minimizing algorithms engage in a repeated auction on
behalf of their users. We study first-price and second-price auctions\, as
well as their generalized versions (e.g.\, as those used for ad auctions)
. Using both theoretical analysis and simulations\, we show that\, surpris
ingly\, in second-price auctions the players have incentives to misreport
their true valuations to their own learning agents\, while in the first-pr
ice auction it is a dominant strategy for all players to truthfully report
their valuations to their agents.\n\nSee the paper on arXiv: https://arxi
v.org/abs/2110.11855 as well as a related companion paper: https://arxiv.o
rg/abs/2112.07640.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:185@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220626T123000
DTEND;TZID=Asia/Jerusalem:20220626T133000
DTSTAMP:20220322T102714Z
URL:https://dds.technion.ac.il/iemevents/semi-definite-programming-meets-s
tance-classification-how-to-turn-theory-into-good-practice/
SUMMARY:Semi-Definite Programming meets Stance Classification - how to turn
theory into good practice. [ \n Computational Data Science (CDS) Semi
nar\n Seminars\n \n ]
DESCRIPTION:By: Dan Vilenchik\n Advisors: \n Where: Bloomfield 527 and ZOO
M From:\nBen Gurion University\nAbstract: Stance detection is an important
task\, supporting many downstream tasks such as discourse parsing and mod
eling the propagation of fake news\, rumors\, and science denial. In this
talk we describe a novel framework for stance detection. Our framework is
unsupervised and domain-independent. Given a claim and a multi-participant
discussion - we construct the interaction network from which we derive to
pological embeddings for each speaker. These speaker embeddings enjoy the
following property: speakers with the same stance tend to be represented b
y similar vectors\, while antipodal vectors represent speakers with opposi
ng stances. These embeddings are then used to divide the speakers into sta
nce-partitions. Our embedding is derived from the Semi-Definite Programmin
g (SDP) solution to the max-cut problem on the interaction network. In thi
s talk we shall explain how the success of our method is rooted in theoret
ical results of SDP integrality for random k-colorable graphs.\nWe evaluat
ed our method on three different datasets from different platforms. Our me
thod outperforms or is comparable with supervised models\, including Neura
l Network based approaches.\nZoom Link\n\nhttps://technion.zoom.us/j/94950
420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:240@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220707T123000
DTEND;TZID=Asia/Jerusalem:20220707T133000
DTSTAMP:20220809T104344Z
URL:https://dds.technion.ac.il/iemevents/adversarially-robust-conformal-pr
ediction/
SUMMARY:Adversarially Robust Conformal Prediction [ \n Computational Da
ta Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Asaf Gendler \n Advisors: \n Where: Mayer 1061 and ZOOM F
rom:\nTechnion\nAbstract:\n\nConformal prediction is a model-agnostic tool
for constructing prediction sets that are valid under the common i.i.d. a
ssumption\, which has been applied to quantify the prediction uncertainty
of deep net classifiers. In this talk\, we generalize this framework to th
e case where adversaries exist during inference time\, under which the i.i
.d. assumption is grossly violated. By combining conformal prediction with
randomized smoothing\, our proposed method forms a prediction set with fi
nite-sample coverage guarantee that holds for any data distribution with
ℓ2-norm bounded adversarial noise\, generated by any adversarial attack
algorithm. The core idea is to bound the Lipschitz constant of the non-con
formity score by smoothing it with Gaussian noise and leverage this knowle
dge to account for the effect of the unknown adversarial perturbation. We
demonstrate the necessity of our method in the adversarial setting and the
validity of our theoretical guarantee on three widely used benchmark data
sets: CIFAR10\, CIFAR100\, and ImageNet.\n\n* M.Sc. student under the sup
ervision of Professor Yaniv Romano.\n\nArticle link\n\nZoom Link\n\nhttps:
//technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:253@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220725T110000
DTEND;TZID=Asia/Jerusalem:20220725T123000
DTSTAMP:20220809T120304Z
URL:https://dds.technion.ac.il/iemevents/causal-machine-learning-a-necessa
ry-ingredient-for-building-generalizable-models-plus-an-introduction-to-do
why-causality-library/
SUMMARY:Causal Machine Learning: A necessary ingredient for building genera
lizable models (plus an introduction to DoWhy causality library) [ \n
Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Amit Sharma\n Advisors: \n Where: ZOOM From:\nMicrosoft \n
Abstract:\nIn this two-part talk\, I will describe:\n\n a) how causality
can help solve the fundamental challenge of out-of-distribution generaliza
tion for machine learning\; and\nb) the DoWhy library\, an open-source implementat
ion of these ideas.\n\nState-of-the-art machine learning models fail to ge
neralize as the distribution of data shifts from the training set\, even u
nder small shifts such as rotation (images) or changing semantically-equiv
alent words (text). These failures often occur because models tend to lear
n spurious correlations from the training data that break when new data is pres
ented. While many solutions based on regularization or data augmentation h
ave been proposed\, recent empirical studies show that none of them work r
eliably across datasets. The reason is that these approaches do not consid
er the causal structure of the underlying data-generating process that gov
erns how distribution shifts happen. I will present a new framework for bu
ilding generalizable ML models that injects known causal knowledge direct
ly into the training of neural networks. It does so by characterizing the
different kinds of distribution shifts using a causal graph and automatic
ally inferring the correct regularization to apply. As a corollary\, we th
eoretically show that no single regularization can work for all kinds of
shifts\, explaining the weak empirical evidence for prior work. Using the c
ausal knowledge\, our framework obtains a 10-30% improvement over state-of-the-art models
on benchmark datasets.\n\nMoving away from benchmark datasets\, however\,
there are many challenges in implementing causal algorithms. The biggest
challenge is how to come up with the right causal assumptions for a given
problem. In this direction\, I will describe DoWhy\, an open-source librar
y that seeks to address this problem by providing a convenient interface f
or expressing and validating assumptions\, to the extent possible. In par
ticular\, DoWhy implements multiple robustness checks to test causal assum
ptions based on the idea of negative controls\, including placebo treatm
ent and unobserved confounding tests. While DoWhy started out as an effect
inference library\, it now supports multiple causal tasks such as attribu
tion and prediction. For each task\, DoWhy abstracts out the causal anal
ysis as a simple four-step API: 1) modeling a causal graph using structura
l assumptions\, 2) identifying whether the desired quantity is estimable u
nder the causal graph\, 3) estimating the quantity using statistical estim
ators\, and finally 4) refuting the obtained estimate through robustness c
hecks and sensitivity analyses. I will present case-studies on using the l
ibrary for different causal tasks. Contributions and feedback are welcome
at https://github.com/py-why/dowhy.\n\nBio:\n\nAmit Sharma is a Princip
al Researcher at Microsoft Research India. His work bridges causal inferen
ce techniques with machine learning\, with the goal of making machine lear
ning models generalize better\, be explainable and avoid hidden biases. To
this end\, Amit has co-led the development of the open-source DoWhy libra
ry for causal inference and DiCE library for counterfactual explanations.
The broader theme in his work is how machine learning can be used for bett
er decision-making\, especially in sensitive domains. In this direction\,
Amit collaborates with NIMHANS on mental health technology\, including a r
ecent app\, MindNotes\, that encourages people to break stigma and reach o
ut to professionals. His work has received many awards including a Best Pa
per Award at ACM CHI 2021 conference\, Best Paper Honorable Mention at ACM
CSCW 2016 conference\, 2012 Yahoo! Key Scientific Challenges Award a
nd the 2009 Honda Young Engineer and Scientist Award. Amit received his Ph
.D. in computer science from Cornell University and B.Tech. in Computer Sc
ience and Engineering from Indian Institute of Technology (IIT) Kharagpur.
\n\nhttps://www.microsoft.com/en-us/research/people/amshar/\n\n
Zoom Link\n\nhttps://technion.zoom.us/j/3038131699
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:247@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220814T123000
DTEND;TZID=Asia/Jerusalem:20220814T133000
DTSTAMP:20220809T110211Z
URL:https://dds.technion.ac.il/iemevents/structured-learning-for-low-resou
rce-natural-language-processing/
SUMMARY:Structured Learning for Low-resource Natural Language Processing [
\n Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: PhD Guy Rotman\n Advisors: Prof. Roi Reichart\n Where: ZOOM
From:\nTechnion\nAbstract: \n\nNatural language processing (NLP) is a mac
hine learning field that operates over text in its various forms\, written
in one of the many existing human languages. Textual human writings can b
e found in a large number of different domains\, including news articles\,
medical records\, economic reports\, and scientific papers\, to name a fe
w.\nModern NLP models applied to such texts require a significant amount o
f labeled data to obtain satisfactory results on downstream tasks. Nonethe
less\, scarcity is common\, especially for languages with fewer native spe
akers or domains outside the mainstream. In such low-resource cases\, it i
s essential to construct algorithms that can use external resources\, such
as unlabeled data from the same or from a different domain\, external dat
abases (e.g. Wikipedia)\, and domain expertise of human experts.\nIn this
dissertation\, we aim to investigate and develop innovative structured alg
orithms for NLP tasks under low-resource conditions. As part of our experi
ments\, we engage in challenging setups present in low-resource NLP. These
include unsupervised domain adaptation for cross-domain and low-resource
languages\, model compression under domain shift\, and active learning tha
t operates over pairs of tasks with unique task relations. With such chall
enging setups\, we demonstrate a substantial increase in performance by im
proving common low-resource techniques\, such as self-training\, structure
d model compression\, and confidence-based active learning.\n\n
Zoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:250@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220816T140000
DTEND;TZID=Asia/Jerusalem:20220816T144000
DTSTAMP:20220809T114605Z
URL:https://dds.technion.ac.il/iemevents/%d7%9brank-wolfe-based-algorithms
/
SUMMARY:Frank-Wolfe-based Algorithms for Approximating Tyler’s M-estimato
r [ \n Computational Data Science (CDS) Seminar\n Graduate Student
Seminar\n Seminars\n \n ]
DESCRIPTION:By: M.Sc. Lior Danon\n Advisors: Assis. Prof. Dan Graber\n Wher
e: Bloomfield 424 From:\nTechnion\nTyler's M-estimator is a well-known pro
cedure for robust and heavy-tailed covariance estimation. Tyler himself sug
gested an iterative fixed-point algorithm for computing his estimator\; how
ever\, it requires super-linear (in the size of the data) runtime per iter
ation\, which may be prohibitive at large scale. In this work we propose\
, to the best of our knowledge\, the first Frank-Wolfe-based algorithms fo
r computing Tyler's estimator. One variant uses standard Frank-Wolfe steps
\, the second also considers away-steps (AFW)\, and the third is a \\texti
t{geodesic} version of AFW (GAFW). AFW provably requires\, up to a log fac
tor\, only linear time per iteration\, while GAFW runs in linear time (up
to a log factor) in a large n (number of data-points) regime. All three
variants are shown to provably converge to the optimal solution with subli
near rate\, under standard assumptions\, despite the fact that the underly
ing optimization problem is neither convex nor smooth. Under an additiona
l fairly mild assumption\, which holds with probability 1 when the (normal
ized) data-points are i.i.d. samples from a continuous distribution suppor
ted on
the entire unit sphere\, AFW and GAFW are proved to converge with linear
rates. Importantly\, all three variants are parameter-free and use adapt
ive step-sizes.
CATEGORIES:Computational Data Science (CDS) Seminar,Graduate Student
Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:252@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220816T145000
DTEND;TZID=Asia/Jerusalem:20220816T153000
DTSTAMP:20220809T114539Z
URL:https://dds.technion.ac.il/iemevents/local-linear-convergence-of-gradi
ent-methods-for-subspace-optimization-via-strict-complementarity/
SUMMARY:Local Linear Convergence of Gradient Methods for Subspace Optimiza
tion via Strict Complementarity [ \n Computational Data Science (CDS)
Seminar\n Graduate Student Seminar\n Seminars\n \n ]
DESCRIPTION:By: M.Sc. Ron Fisher\n Advisors: Assis. Prof. Dan Graber\n Wher
e: Bloomfield 424 From:\nTechnion\nAbstract: We consider optimization prob
lems in which the goal is to find a k-dimensional subspace of R^n\, with k <
< n\, which minimizes a convex and smooth loss. Such problems generalize t
he fundamental task of principal component analysis (PCA) to include robus
t and sparse counterparts\, and logistic PCA for binary data\, among other
s. While this problem is not convex\, it admits natural algorithms with ve
ry efficient iterations and memory requirements\, which is highly desired i
n high-dimensional regimes\; however\, arguing about their fast convergenc
e to a global optimal solution is difficult. On the other hand\, there exi
sts a simple convex relaxation for which convergence to the global optimu
m is straightforward\; however\, the corresponding algor
ithms are not efficient when the dimension is very large. In this work we
present a natural deterministic sufficient condition so that the optimal s
olution to the convex relaxation is unique and is also the optimal solutio
n to the original nonconvex problem. Mainly\, we prove that under this con
dition\, a natural highly-efficient nonconvex gradient method\, which we r
efer to as “gradient orthogonal iteration”\, when initialized with a “war
m-start”\, converges linearly for the nonconvex problem. W
e also establish similar results for the nonconvex projected gradient meth
od\, and the Frank-Wolfe method when applied to the convex relaxation. We
conclude with empirical evidence on synthetic data which demonstrates the a
ppeal of our approach.
CATEGORIES:Computational Data Science (CDS) Seminar,Graduate Student
Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:249@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220816T154000
DTEND;TZID=Asia/Jerusalem:20220816T162000
DTSTAMP:20220809T114240Z
URL:https://dds.technion.ac.il/iemevents/weak_oracle_based_augmented_lagra
ngian_method/
SUMMARY:Weak Oracle Based Augmented Lagrangian Method For Composite Optimiz
ation [ \n Computational Data Science (CDS) Seminar\n Graduate Stu
dent Seminar\n Seminars\n \n ]
DESCRIPTION:By: M.Sc. Tsur Livney\n Advisors: Assis. Prof. Dan Graber and A
ssoc. Prof. Shoham Sabach\n Where: Bloomfield 424 From:\nTechnion\nAbstrac
t: This paper considers a convex composite optimization problem with affin
e constraints\, which includes problems of minimization over an intersecti
on of convex sets. We propose an augmented Lagrangian based method\, in wh
ich we perform primal updates using a Weak Proximal Oracle (WPO). The WPO i
s an oracle more powerful than the standard linear minimization oracle (lm
o) used in conditional gradient based methods\, yet computationally feasib
le for large-scale problems in interesting and important domains such as p
olytopes and the trace norm ball\, where the optimal solution is of low ra
nk\, in contrast to the standard quadratic minimization oracle used in pro
ximal methods. For polytopes\, we show an implementation of such an oracl
e that requires one call to an lmo. For trace norm regularization\, assum
ing the optimal solution is of low rank k\, we show that such an oracle c
an be implemented in roughly k times the complexity of an lmo. We also sh
ow an extension of the latter to tensors of low rank. Under an assumptio
n of a primal quadratic gap\, we achieve a convergence rate of O(1/N) on b
oth the objective residual and the feasibility gap.
CATEGORIES:Computational Data Science (CDS) Seminar,Graduate Student
Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:255@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220911T110000
DTEND;TZID=Asia/Jerusalem:20220911T120000
DTSTAMP:20220831T091746Z
URL:https://dds.technion.ac.il/iemevents/domain-adaptation-in-nlp-towards-
adaptation-to-any-domain/
SUMMARY:Domain Adaptation in NLP - Towards Adaptation to Any Domain [ \n
Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: PhD Eyal Ben-David \n Advisors: Professor Roi Reichart.\n
Where: Zoom From:\nTechnion\nAbstract:\n\nNatural language processing (NLP
) algorithms have reached unprecedented milestones in the last few years\,
primarily due to the introduction of sizeable pre-trained language models
. While decent results can be achieved even in zero-shot setups (i.e.\, wh
en the model is not exposed to labeled data from the task of interest)\, s
olid results still depend on fine-tuning with labeled task data from the t
est distribution (a.k.a.\, the target domain distribution). Yet\, since te
st data may emanate from many linguistic domains (each with unique distrib
utional properties)\, NLP algorithms are likely to perform under an out-of
-distribution (OOD) scenario. As generalization beyond the training distri
bution is still a fundamental challenge\, NLP algorithms suffer a signific
ant degradation when applied to OOD examples.\n\nThis seminar addresses th
ese shortcomings\, primarily focusing on domain adaptation definitions\, w
hich naturally tackle the OOD challenge. We present two complementary effo
rts: The evolution of classic domain adaptation methods in the large pretr
ained language models era\; and the development of new fundamental approac
hes to performing domain adaptation.\n\nZoom Link\n\nhttps://technion.zoom
.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:260@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220922T103000
DTEND;TZID=Asia/Jerusalem:20220922T113000
DTSTAMP:20220915T074824Z
URL:https://dds.technion.ac.il/iemevents/learning-generalized-gumbel-max-c
ausal-mechanisms/
SUMMARY:Learning Generalized Gumbel-max Causal Mechanisms [ \n Computat
ional Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: PhD Guy Lorberbom\n Advisors: Prof. Tamir Hazan\n Where: ZO
OM From:\nTechnion\nAbstract:\n\nTo perform counterfactual reasoning in St
ructural Causal Models (SCMs)\, one needs to know the causal mechanisms\,
which provide factorizations of conditional distributions into noise sourc
es and deterministic functions mapping realizations of noise to samples. U
nfortunately\, the causal mechanism is not uniquely identified by data tha
t can be gathered by observing and interacting with the world\, so there r
emains the question of how to choose causal mechanisms.\nIn recent work\,
Oberst & Sontag (2019) propose Gumbel-max SCMs\, which use Gumbel-max
reparameterizations as the causal mechanism due to an intuitively appeali
ng counterfactual stability property.\nIn this work\, we instead argue fo
r choosing a causal mechanism that is best under a quantitative criterion\, su
ch as minimizing variance when estimating counterfactual treatment effects
. We propose a parameterized family of causal mechanisms that generalize G
umbel-max. We show that they can be trained to minimize counterfactual eff
ect variance and other losses on a distribution of queries of interest\, y
ielding lower variance estimates of counterfactual treatment effect than f
ixed alternatives\, also generalizing to queries not seen at training time
.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:261@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20220922T113000
DTEND;TZID=Asia/Jerusalem:20220922T123000
DTSTAMP:20220915T075256Z
URL:https://dds.technion.ac.il/iemevents/on-the-importance-of-gradient-nor
m-in-pac-bayesian-bounds/
SUMMARY:On the Importance of Gradient Norm in PAC-Bayesian Bounds [ \n
Computational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: PhD Itai Gat\n Advisors: Prof. Tamir Hazan \n Where: ZOOM F
rom:\nTechnion\nAbstract:\n\nGeneralization bounds that assess the differe
nce between the true risk and the empirical risk have been studied extensi
vely. However\, to obtain bounds\, current techniques use strict assumptio
ns such as a uniformly bounded or a Lipschitz loss function. To avoid thes
e assumptions\, in this work\, we follow an alternative approach: we relax
uniform bounds assumptions by using on-average bounded loss and on-averag
e bounded gradient norm assumptions. Following this relaxation\, we propos
e a new generalization bound that exploits the contractivity of the log-So
bolev inequalities. These inequalities add an additional loss-gradient nor
m term to the generalization bound\, which is intuitively a surrogate of t
he model complexity. We apply the proposed bound on Bayesian deep nets and
empirically analyze the effect of this new loss-gradient norm term on dif
ferent neural architectures.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94
950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:262@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221027T133000
DTEND;TZID=Asia/Jerusalem:20221027T144500
DTSTAMP:20221023T122121Z
URL:https://dds.technion.ac.il/iemevents/improve-agents-without-retraining
-parallel-tree-search-with-off-policy-correction/
SUMMARY:Improve Agents without Retraining: Parallel Tree Search with Off-Po
licy Correction [ \n Computational Data Science (CDS) Seminar\n Se
minars\n \n ]
DESCRIPTION:By: Gal Dalal\n Advisors: \n Where: Cooper 112 From:\nNVIDIA
Research\nAbstract:\n\nTree Search (TS) is crucial to some of the most inf
luential successes in reinforcement learning. In this work\, we tackle two
major challenges with TS that limit its usability: distribution shift and
scalability. We first discover and analyze a counter-intuitive phenomenon
: action selection through TS and a pre-trained value function often leads
to lower performance compared to the original pre-trained agent\, even wh
en having access to the exact state and reward in future steps. We show th
is is due to a distribution shift to areas where value estimates are highl
y inaccurate and analyze this effect using Extreme Value theory. To overco
me this problem\, we introduce a novel off-policy correction term. We prov
e that this term eliminates the above mismatch and bound the probability o
f sub-optimal action selection. Our correction significantly improves pre-
trained Rainbow agents without any further training\, often more than doub
ling their scores on Atari games. Next\, we address the scalability issue
given by the computational complexity of exhaustive TS that scales exponen
tially with the tree depth. We introduce Batch-BFS: a GPU breadth-first se
arch that advances all nodes in each depth of the tree simultaneously. Usi
ng Batch-BFS\, we train DQN agents from scratch using TS and show improvem
ent in several Atari games compared to both the original DQN and the more
advanced Rainbow.
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:264@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221103T133000
DTEND;TZID=Asia/Jerusalem:20221103T143000
DTSTAMP:20221030T082826Z
URL:https://dds.technion.ac.il/iemevents/on-the-limitations-of-dataset-bal
ancing-the-lost-battle-against-spurious-correlations/
SUMMARY:On the Limitations of Dataset Balancing: The Lost Battle Against Sp
urious Correlations [ \n Computational Data Science (CDS) Seminar\n
Seminars\n \n ]
DESCRIPTION:By: Roy Schwartz\n Advisors: \n Where: ZOOM From:\nHebrew Univ
ersity of Jerusalem \nAbstract:\n\nRecent work has shown that deep learnin
g models in NLP are highly sensitive to low-level correlations between sim
ple features and specific output labels\, leading to overfitting and lack
of generalization. To mitigate this problem\, a common practice is to bala
nce datasets by adding new instances or by filtering out "easy" instances
\, culminating in a recent proposal to eliminate single-word correlations
altogether. In this talk\, I will identify that despite these efforts\, in
creasingly-powerful models keep exploiting ever-smaller spurious correlati
ons\, and as a result even balancing all single-word features is insuffici
ent for mitigating all of these correlations. In parallel\, a truly balanc
ed dataset may be bound to "throw the baby out with the bathwater" and mi
ss important signals encoding common sense and world knowledge. I will hig
hlight several alternatives to dataset balancing\, focusing on enhancing d
atasets with richer contexts\, allowing models to abstain and interact wit
h users\, and turning from large-scale fine-tuning to zero- or few-shot se
tups.\n\nThis is joint work with Gabriel Stanovsky.\n\nZoom Link\n\nhttps
://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:269@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221110T133000
DTEND;TZID=Asia/Jerusalem:20221110T143000
DTSTAMP:20221106T071202Z
URL:https://dds.technion.ac.il/iemevents/situation-aware-explainability-in
-business-processes/
SUMMARY:Situation-Aware eXplainability in business processes [ \n Compu
tational Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Fabiana Fournier and Lior Limonad\n Advisors: \n Where: Co
oper 112 and ZOOM From:\nIBM\nAbstract:\n\nAugmented Business Process Mana
gement Systems (ABPMSs) are the next generation of business process syst
ems. An ABPMS is an AI-empowered\, trustworthy\, and process-aware inform
ation system that reasons and acts upon data within a set of constraints a
nd assumptions\, with the aim to continuously adapt and improve a set of b
usiness processes with respect to one or more performance indicators. One
of the main characteristics of ABPMSs is their ability to reason and expla
in process outcomes. However\, state-of-the-art frameworks for explainabil
ity fail to capture the rich contextual situations that cause process outc
omes\, which is necessary to produce adequate explanations. In this talk\, w
e will present the main characteristics and challenges of ABPMSs and focus
on recent developments around Situation-Aware eXplainability (SAX). SAX i
s an advanced form of sense-making in ABPMSs. Fundamentally\, it enables re
asoning and inferring adequate assertions that tie between process executi
on choices and their preceding premises (e.g.\, assumptions\, beliefs\, in
tentions\, and earlier conditions) while considering the rich circumstanti
al/contextual environment in which such executions occurred.\n\nZoom Link\
n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:265@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221117T133000
DTEND;TZID=Asia/Jerusalem:20221117T143000
DTSTAMP:20221102T111358Z
URL:https://dds.technion.ac.il/iemevents/transformer-explainability-beyond
-accountability/
SUMMARY:Transformer Explainability beyond Accountability [ \n Computati
onal Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: Hila Chefer\n Advisors: \n Where: Cooper 112 From:\nTel Av
iv University\nAbstract:\n\nTransformers have revolutionized deep learning
research across many disciplines\, starting from NLP and expanding to vis
ion\, speech\, and more. In my talk\, I will explore several milestones to
ward interpreting all families of Transformers\, including unimodal\, bi-m
odal\, and encoder-decoder Transformers. I will present working examples a
nd results that cover some of the most prominent models\, including CLIP\,
ViT\, and LXMERT.\n\nI will then present our recent explainability-driven
fine-tuning technique that significantly improves the robustness of Visio
n Transformers (ViTs). The loss we employ ensures that the model bases its
prediction on the relevant parts of the input rather than supportive cues
(e.g.\, background).
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:279@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221124T133000
DTEND;TZID=Asia/Jerusalem:20221124T143000
DTSTAMP:20221122T092647Z
URL:https://dds.technion.ac.il/iemevents/theoretical-and-practical-princip
les-for-designing-training-and-deploying-huge-language-models/
SUMMARY:Theoretical and practical principles for designing\, training\, and
deploying huge language models [ \n Computational Data Science (CDS)
Seminar\n Seminars\n \n ]
DESCRIPTION:By: Yoav Levine\n Advisors: \n Where: Bloomfield 527 and Zoom
From:\nAI21 Labs\nAbstract:\n\nThe field of natural language processing (N
LP) has been advancing in giant strides over the past several years. The m
ain drivers of this success are: (1) scaling the Transformer deep network
architecture to unprecedented sizes and (2) “pretraining” the Transfor
mer over massive amounts of unlabeled text. In this talk\, I will describe
efforts to provide principled guidance for the above main components and
further thrusts in contemporary NLP\, aimed to serve as timely constructiv
e feedback for the strong empirical pull in this field.\n\nI wi
ll begin by describing our theoretical framework for analyzing Transformer
s\, and present results on the depth to width tradeoffs in Transformers\,
identified bottlenecks within internal Transformer dimensions\, and identi
fied biases introduced during the Transformer self-supervised pretraining
phase. This framework has guided the design and scale of several of the la
rgest existing language models\, including Chinchilla by Deepmind (70 bill
ion learned parameters)\, Bloom by BigScience (176 billion learned paramet
ers)\, and Jurassic-1 by AI21 (178 billion learned parameters). Then\, I wi
ll describe our works on leveraging linguistic biases such as word senses
or frequent n-grams in order to increase efficiency of the self-supervised
pretraining phase. Subsequently\, I will describe novel principles for ad
dressing a present-day problem stemming from the above success of scaling\
, namely\, how to deploy a huge language model such that it specializes in
many different use cases simultaneously (e.g.\, supporting many different
customer needs simultaneously). Finally\, I will comment on future challe
nges in this field\, and will relatedly present a recent theoretical resul
t on the importance of intermediate supervision when solving composite lin
guistic tasks.\n\nThis talk is based on works published in Neur
IPS 2020\, ACL 2020\, ICLR 2021 (spotlight)\, ICML 2021\, ICLR 2022 (spotl
ight)\, ICML 2022 (workshop)\, as well as on several recent preprints.\n\n
ZOOM LINK\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:297@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20221215T133000
DTEND;TZID=Asia/Jerusalem:20221215T143000
DTSTAMP:20221212T111956Z
URL:https://dds.technion.ac.il/iemevents/from-theory-to-practice-and-back-
stance-detection-as-a-case-study/
SUMMARY:From theory to practice and back – Stance Detection as a case stu
dy [ \n Computational Data Science (CDS) Seminar\n Seminars\n \n
]
DESCRIPTION:By: Dan Vilenchik\n Advisors: \n Where: Bloomfield 527 and Zoo
m From:\nBen Gurion University\nAbstract:\n\nStance detection is an import
ant task\, supporting many downstream tasks such as discourse parsing and
modeling the propagation of fake news\, rumors\, and science denial. In th
is talk we describe a novel framework for stance detection. Our framework
is unsupervised and domain-independent. Given a claim and a multi-particip
ant discussion - we construct the interaction network from which we derive
topological embeddings for each speaker. These speaker embeddings enjoy t
he following property: speakers with the same stance tend to be represente
d by similar vectors\, while antipodal vectors represent speakers with opp
osing stances. These embeddings are then used to divide the speakers into
stance-partitions. Our embedding is derived from the Semi-Definite Program
ming (SDP) solution to the max-cut problem on the interaction network. In
this talk we shall explain how the success of our method is rooted in theo
retical results in random graph theory.\n\nWe evaluated our method on thre
e different datasets from different platforms. Our method outperforms or i
s comparable with supervised models\, including Neural Network based appro
aches.\n\nThe talk is based on a joint paper with Ron Korenblum
Pick\, Vladyslav Kozhukhov (students) and Oren Tsur (PI).\n\nPaper appear
ed in AAAI 2022.\n\nhttps://arxiv.org/abs/2112.00712\n\nZoom Li
nk\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:306@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20230105T133000
DTEND;TZID=Asia/Jerusalem:20230105T143000
DTSTAMP:20230101T092458Z
URL:https://dds.technion.ac.il/iemevents/generative-speech-and-audio-model
ing-using-neural-discrete-representations/
SUMMARY:Generative Speech and Audio Modeling using Neural Discrete Represen
tations [ \n Computational Data Science (CDS) Seminar\n Seminars\n
\n ]
DESCRIPTION:By: Yossi Adi\n Advisors: \n Where: Bloomfield 526 and Zoom Fr
om:\nHebrew University\nAbstract:\n\nSpeech and audio modeling is a challe
nging task for various reasons. First\, such signals are extremely long: t
he common sampling rate for speech signals is 16 kHz\, while for music it
is 44.1 kHz. Second\, unlike text\, speech and audio representations are conti
nuous in nature\, which makes modeling and sampling difficult.\n\nIn this
talk\, I will present an alternative view for generative speech and audio
modeling in which we use a discrete representation obtained by neural nets
trained in an unsupervised fashion. Equipped with this representation we
can adapt recent powerful natural language processing models and demonstra
te the ability to capture long-term dependencies in the signal as well as
high-quality generations. Specifically\, I will present our recent advance
ments in speech and audio language modeling involving different modalities
 for model conditioning and the various applications this can be used for.\n\nZ
oom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:328@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20230212T133000
DTEND;TZID=Asia/Jerusalem:20230212T143000
DTSTAMP:20230312T122700Z
URL:https://dds.technion.ac.il/iemevents/detecting-suicide-risk-from-visua
l-facebook-posts/
SUMMARY:Detecting Suicide Risk from Visual Facebook Posts [ \n Computat
ional Data Science (CDS) Seminar\n Seminars\n \n ]
DESCRIPTION:By: MSC Yael Badian\n Advisors: Roi Reichart\n Where: ZOOM Fro
m:\nTechnion\nAbstract:\nAlthough AI research for the development of suici
de prevention methods is promising\, it still has several major gaps\, inc
luding “black box” methodologies\, inadequate outcome measures and sca
rce research on non-verbal inputs\, such as images. In this talk\, I will pr
esent our novel method to predict valid suicide risk from social media ima
ges. Contrary to most deep learning-based methods\, our suggested method b
enefits from the advantages of neural models\, but is also highly interpre
table. Under the assumption that certain images may contain visual cues wh
ich are correlated with suicide risk (such as color\, brightness\, and fac
ial expressions)\, we manually design a set of text queries which describe
these characteristics. Then\, we rely on OpenAI’s CLIP\, a large pre-tr
ained multi-modal model\, to extract features from the images using these
queries. The resulting features are then used to make suicide risk predict
ions in an interpretable manner. I will explain our method in detail\, des
cribe the challenges we encountered\, the data we used and share our resul
ts.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VEVENT
UID:326@dds.technion.ac.il
DTSTART;TZID=Asia/Jerusalem:20230312T133000
DTEND;TZID=Asia/Jerusalem:20230312T143000
DTSTAMP:20230228T083922Z
URL:https://dds.technion.ac.il/iemevents/beast-net-learning-novel-behavior
al-insights-using-a-neural-network-adaptation-of-a-behavioral-model/
SUMMARY:BEAST-Net: Learning novel behavioral insights using a neural networ
k adaptation of a behavioral model [ \n Computational Data Science (CD
S) Seminar\n Seminars\n \n ]
DESCRIPTION:By: M.Sc. Vered Shohan\n Advisors: Ori Plonsky \n Where: ZOOM
From:\nTechnion \nAbstract:\nIn this talk I will present a behavioral mode
l called BEAST-Net\, which combines the basic logic of BEAST\, a psycholog
ical theory-based behavioral model\, with machine learning (ML) techniques
. Our approach was to formalize BEAST mathematically as a differentiable f
unction and parameterize it with a neural network\, enabling us to learn t
he model parameters from data and optimize it using backpropagation. The r
esulting model\, BEAST-Net\, is able to scale to larger datasets and adapt
to new data with greater ease\, while retaining the psychological insight
s and interpretability of the original model. We evaluate BEAST-Net on the
largest public benchmark dataset of human choice tasks and show that it o
utperforms several baselines\, including the original BEAST model. Furthe
rmore\, we demonstrate that our model can be used to provide interpretable
explanations for choice behavior\, allowing us to derive new psychologica
l insights from the data. Our work makes a significant contribution to the 
field of human decision making by showing that ML techniques can be used t
o improve the scalability and adaptability of psychological theory-based m
odels while preserving their interpretability and ability to provide insig
hts.\n\nZoom Link\n\nhttps://technion.zoom.us/j/94950420992
CATEGORIES:Computational Data Science (CDS) Seminar,Seminars
END:VEVENT
BEGIN:VTIMEZONE
TZID:Asia/Jerusalem
X-LIC-LOCATION:Asia/Jerusalem
BEGIN:STANDARD
DTSTART:20201025T010000
TZOFFSETFROM:+0300
TZOFFSETTO:+0200
TZNAME:IST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20210326T030000
TZOFFSETFROM:+0200
TZOFFSETTO:+0300
TZNAME:IDT
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20211031T010000
TZOFFSETFROM:+0300
TZOFFSETTO:+0200
TZNAME:IST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20220325T030000
TZOFFSETFROM:+0200
TZOFFSETTO:+0300
TZNAME:IDT
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20221030T010000
TZOFFSETFROM:+0300
TZOFFSETTO:+0200
TZNAME:IST
END:STANDARD
END:VTIMEZONE
END:VCALENDAR