First step of the procedure: we evaluate the unknown function (orange) at a few points (black) and compute the confidence interval (light blue). The dark blue line represents the best estimate of the unknown function.
Max\_depth evolution: the value of max\_depth is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
Step 3: the GP estimate of the true function, updated with the new information provided by the evaluation of $x_{next}$.
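The loop these three captions describe (fit a GP to the evaluated points, maximize an acquisition function to pick $x_{next}$, evaluate it and re-fit) can be sketched as below. This is an illustrative reconstruction with scikit-learn's GaussianProcessRegressor and an expected-improvement acquisition, not the exact code behind the figures; the function `objective` and all numeric settings are invented.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):
    # Toy stand-in for the unknown function (the orange curve).
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(4, 1))          # step 1: a few evaluated points (black)
y = objective(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(X, y)                                # GP mean = dark blue line, std -> light blue band

grid = np.linspace(0, 2, 200).reshape(-1, 1)
mu, sigma = gp.predict(grid, return_std=True)

# Step 2: expected improvement (for minimization) over the grid.
best = y.min()
sigma = np.maximum(sigma, 1e-9)
z = (best - mu) / sigma
ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
x_next = grid[np.argmax(ei)]                # the violet star: next point to evaluate

# Step 3: evaluate x_next and re-fit the GP with the new information.
X = np.vstack([X, x_next])
y = np.append(y, objective(x_next))
gp.fit(X, y)
```

In the hyperparameter scans shown later, `objective` is the (expensive) training-plus-testing of the regression, and this loop is what produces one dot per iteration.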
Min\_child\_weight evolution: the value of min\_child\_weight is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
An overall view of particle accelerators at CERN.
Coordinate system at CMS
The LHC.
Box plot of the corrected energy with optimized hyperparameters as a function of $pt$. The green line in the middle is the median and the box spans the first to the third quartile. The black dots represent events lying outside 1.5 times the interquartile range.
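The quantities drawn in these box plots (median, first and third quartiles, and the points beyond 1.5 times the interquartile range) can be computed as in this small NumPy sketch; it is illustrative, not the plotting code used for the figures, and the toy sample is invented.

```python
import numpy as np

def box_plot_stats(values):
    """Median, quartiles and outliers as drawn in a standard box plot."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1                                       # interquartile range (box height)
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr             # whisker limits
    outliers = values[(values < lo) | (values > hi)]    # drawn as black dots
    return median, q1, q3, outliers

rng = np.random.default_rng(1)
sample = rng.normal(loc=1.0, scale=0.1, size=1000)      # toy energy ratios
median, q1, q3, outliers = box_plot_stats(sample)
```

One such box is computed per bin of $pt$ (or $\eta$) to build the figures.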
Box plot of the uncorrected energy as a function of $pt$ from 20 to 310. The green line in the middle is the median and the box spans the first to the third quartile. The black dots represent events lying outside 1.5 times the interquartile range.
Gamma evolution: the value of gamma is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
Decision tree example for boosting. The size of the letters represents the weight assigned to each data point. Image taken from \cite{BDT}.
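The reweighting the figure illustrates (misclassified points get larger letters, i.e. larger weights, for the next tree) can be sketched with the classic AdaBoost update; this is a generic boosting illustration with invented toy labels, not the gradient-boosting update actually used by XGBoost.

```python
import numpy as np

# Toy labels and the predictions of one weak tree for six points.
y = np.array([1, 1, -1, -1, 1, -1])
pred = np.array([1, -1, -1, -1, 1, 1])            # two points are misclassified

w = np.full(len(y), 1 / len(y))                   # start with uniform weights
miss = pred != y
err = w[miss].sum()                               # weighted error of this tree
alpha = 0.5 * np.log((1 - err) / err)             # vote of this tree in the ensemble
w *= np.exp(alpha * np.where(miss, 1.0, -1.0))    # boost misclassified points
w /= w.sum()                                      # renormalize
```

After the update the misclassified points carry exactly half of the total weight, which forces the next tree to focus on them.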
Box plot of the corrected energy with optimized hyperparameters as a function of $\eta$. The green line in the middle is the median and the box spans the first to the third quartile. The black dots represent events lying outside 1.5 times the interquartile range.
The median of the distribution is plotted for the three models. We observe that the medians of the corrected energy are almost flat.
Colsample\_bytree evolution: the value of colsample\_bytree is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
Distribution of the raw energy divided by the generated energy. We observe that the peak of the corrected energy (red with optimized hyperparameters, black with the default settings) is narrower and more centered than that of the uncorrected energy (blue). The figure on the right is the same as the one on the left, zoomed around one.
Difference between the third and the first quartile in bins of $\eta$. The IQR of the corrected energy is smaller than that of the uncorrected energy in every bin of $\eta$.
Distribution of the raw energy (left) and of the ratio of the maximal energy to the energy of the 5x5 most energetic crystals (right) in the training sample.
Example of a decision tree. The grey boxes are nodes where tests are performed. The lines are the branches representing the possible choices. The orange boxes are the leaves: the results. The question posed in this diagram is which customers will buy a given product as a function of their age and marital status.
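A tree like the one in this figure can be reproduced with scikit-learn on a toy customer dataset; the feature values, labels and depth below are invented for illustration and are not taken from the figure.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy customers: [age, married (1/0)]; label 1 = buys the product.
X = [[22, 0], [25, 1], [30, 1], [35, 0], [45, 1], [52, 0], [60, 1], [64, 0]]
y = [0, 0, 1, 0, 1, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned nodes (tests), branches and leaves as ASCII art.
print(export_text(tree, feature_names=["age", "married"]))
```

Each internal node corresponds to one grey box (a threshold test on a feature) and each leaf to an orange box (the predicted class).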
Composition of the different trees.
Comparison of the energy reconstruction for two different bins of $\eta$. We observe that for $\eta$ around 0 (left), the raw energy is already close to the generated energy, while for $\eta$ at the edge of the interval (right) the two energies differ significantly.
n\_estimator evolution: the value of n\_estimator is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
Schematic view of a transverse slice of the central part of the CMS detector.
Overfitting example
reg\_lambda evolution: the value of reg\_lambda is plotted against the iteration number of the Bayesian optimization procedure. The black lines are the boundaries of the hyperparameter range. The green dot is the default value and the large orange dot is the best value. The color axis shows the root mean square error on the test sample: a red point indicates better performance on the test sample than a blue one.
Classification example, image taken from \cite{BDT}.
Histogram of the effective root mean square on the testing sample. We observe that many events lie very close to the minimum.
Difference between the third and the first quartile in bins of $pt$. We observe that the IQR of the corrected energy is smaller in almost every bin of $pt$, except at low $pt$.
Step 2: the acquisition function, computed from the previous observations, as a function of the hyperparameter. The violet star is the next point to evaluate.
Effective root mean square (effrms) on the testing sample plotted against the iteration number.
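The captions do not define the effective root mean square; in CMS resolution studies it usually denotes half the width of the smallest interval containing 68.3% of the distribution, which is the definition this sketch assumes (the Gaussian toy sample is invented).

```python
import numpy as np

def effrms(values, fraction=0.683):
    """Half the width of the smallest interval containing `fraction` of the values."""
    x = np.sort(np.asarray(values))
    n = len(x)
    k = int(np.ceil(fraction * n))        # number of points the interval must cover
    widths = x[k - 1:] - x[:n - k + 1]    # widths of all candidate intervals
    return 0.5 * widths.min()

rng = np.random.default_rng(2)
gauss = rng.normal(0.0, 1.0, size=100_000)
# For a pure Gaussian, effrms should be close to the standard deviation.
```

Unlike the plain RMS, this estimator is insensitive to the non-Gaussian tails of the energy-ratio distribution, which is why it is used as the figure of merit here.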
Schematic view of the Barrel and the Endcap of the Tracker at the CMS detector.
The median of the distribution is plotted for the three models. We observe that the medians of the corrected energy are almost flat.
Box plot of the uncorrected energy as a function of $\eta$. The green line in the middle is the median and the box spans the first to the third quartile. The black dots represent events lying outside 1.5 times the interquartile range.
Distribution of $\eta$ (left) and $pt$ (right) in the training sample.
Comparison of the energy reconstruction for two different bins of $pt$. We remark that for high $pt$ (left) the correction is larger, while for low $pt$ (right) the correction is smaller but still significant, particularly in the optimized case.