In this paper, we address several aspects of applying classical machine learning algorithms to a regression problem. To validate our approach, we compare the predictive power of the methods on data on the revenue of a large Russian restaurant chain. We pay special attention to two problems: data heterogeneity and a high number of correlated features. We describe two methods for handling heterogeneity: observation weighting and estimating models on subsamples. We define a weighting function via the Mahalanobis distance in the feature space and show its predictive properties for the following methods: ordinary least squares regression, elastic net, support vector regression, and random forest.
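The observation weighting described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the functional form of the weights, so the Gaussian kernel and the `bandwidth` parameter below are assumptions made for illustration, as are all function names; only the use of the Mahalanobis distance in the feature space comes from the text.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (denominator n - 1) of the feature rows."""
    n, p = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the matrix is non-singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis(x, target, cov):
    """Mahalanobis distance sqrt(d' cov^{-1} d) between x and target."""
    diff = [a - b for a, b in zip(x, target)]
    z = solve(cov, diff)  # z = cov^{-1} diff, without forming the inverse
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))

def observation_weights(rows, target, bandwidth=1.0):
    """Assumed Gaussian-kernel weights: observations closer to `target`
    (in Mahalanobis distance) receive weights closer to 1."""
    cov = covariance_matrix(rows)
    return [math.exp(-mahalanobis(r, target, cov) ** 2 / (2 * bandwidth ** 2))
            for r in rows]

# Hypothetical feature rows; the first row is used as the target observation.
rows = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
weights = observation_weights(rows, target=rows[0])
```

Weights of this kind could then be passed to any estimator that accepts per-observation weights, so that a model fitted for a given target observation relies mostly on similar observations.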
Patent foramen ovale (PFO) is an important cause of embolic cryptogenic stroke (ECS) in young patients. The main mechanism in this case is paradoxical embolism (PE), the basis for which is a right-to-left (R-L) shunt. Objective: to comparatively characterize patients who have undergone ECS, with and without an R-L shunt, as evidenced by transcranial Doppler with the bubble test (TCD-BT).
Processing mathematical operations and solving numerical tasks involve a distributed set of brain regions. These regions include the superior and inferior parietal lobules, which underlie numerical processing such as size judgments, and additional prefrontal regions that are needed for formal mathematical operations such as addition, subtraction and multiplication [Arsalidou, Taylor, 2011]. Critically, little is known about the connectivity between these regions and about the association between math performance and the anatomical structure of white matter tracts. The present study investigates connectivity and the white matter tracts associated with networks related to math performance: the arcuate fasciculus (AF) and the superior longitudinal fasciculus (SLF). Participants performed a computerized task with mathematical operations (addition, subtraction, multiplication, and division) at three levels of difficulty; accuracy and reaction time were recorded. Diffusion tensor imaging (DTI) recordings provided indices of fractional anisotropy (FA), a measure of the directionality of water diffusion in white matter tracts. The relation between FA and math performance scores is reported.
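For readers unfamiliar with FA, the following sketch computes it from the three eigenvalues of a diffusion tensor using the standard definition. The eigenvalues in the example are hypothetical and are not taken from the study.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """Standard FA from diffusion tensor eigenvalues:
    sqrt(1/2) * sqrt((l1-l2)^2 + (l2-l3)^2 + (l3-l1)^2) / sqrt(l1^2 + l2^2 + l3^2).
    Ranges from 0 (isotropic diffusion) to 1 (diffusion along a single axis)."""
    denom = l1 * l1 + l2 * l2 + l3 * l3
    if denom == 0:
        return 0.0  # no diffusion at all: treat as isotropic
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    return math.sqrt(0.5 * num / denom)

# Hypothetical eigenvalues for a voxel with one dominant diffusion direction.
fa_example = fractional_anisotropy(1.7, 0.3, 0.2)
```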
This paper is an empirical study of the changing nature of the dependence of the stock market index on fundamental factors, a trend identified earlier in the Russian stock market. We empirically test the impact of daily values of fundamental factors on the MOEX Russia Index from 2003 to 2018. The analysis of the ARIMA-GARCH(1,1) model with a rolling window reveals that the change in the strength and direction of the influence of the fundamental factors on the Russian stock market persists. The Quandt-Andrews breakpoint test and the Bai-Perron test identify the number and likely location of structural breaks. We find multiple breaks that are probably associated with dramatic falls of the stock market index. The results of the regression models over the different regimes, defined by the structural breaks, can vary markedly over time. This research is of value in macroeconomic forecasting and in the development of investment strategies.
In this paper, we analyze a new approach to demand prediction in retail. One of the significant gaps in demand prediction by machine learning methods is that censorship of sales data is not accounted for. Econometric approaches to modeling censored demand are used to obtain consistent and unbiased parameter estimates. These approaches can also be transferred to different classes of machine learning models to reduce the prediction error of sales volume. In this study, we build two ensemble models to predict demand, with and without accounting for demand censorship, aggregating the predictions of machine learning methods such as linear regression, ridge regression, LASSO and random forest. Having estimated the predictive properties of both models, we test whether the model that accounts for the censored nature of demand has the better predictive power.
Proceedings of the Fifth Workshop on Experimental Economics and Machine Learning at the National Research University Higher School of Economics co-located with the Seventh International Conference on Applied Research in Economics (iCare7)
Today, treatment effect estimation at the individual level is a vital problem in many areas of science and business. For example, in marketing, estimates of the treatment effect are used to select the most efficient promo mechanics; in medicine, individual treatment effects are used to determine the optimal dose of medication for each patient. At the same time, the question of how to choose the best method, i.e., the method that ensures the smallest predictive error (for instance, RMSE) or the highest total (average) value of the effect, remains open. Accordingly, in this paper we compare the effectiveness of machine learning methods for estimating individual treatment effects. The comparison is performed on the Criteo Uplift Modeling Dataset. We show that the combination of the Logistic Regression method with the Difference Score method, as well as the Uplift Random Forest method, provides the most accurate prediction of the Individual Treatment Effect on the top 30% of observations of the test dataset.
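The combination of logistic regression with the Difference Score method mentioned above can be sketched as follows, assuming the common two-model formulation: one response model is fitted on the treated group, another on the control group, and the estimated individual effect is the difference of their predicted probabilities. The toy data, the gradient-descent fitting routine, and all names are illustrative assumptions and do not reproduce the authors' setup or the Criteo dataset.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    """P(y = 1 | x) for weights w = [bias, w1, ..., wp]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Logistic regression fitted by batch gradient descent on the log-loss."""
    p, n = len(xs[0]), len(xs)
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for x, y in zip(xs, ys):
            err = predict_proba(w, x) - y
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * x[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def difference_score(w_treated, w_control, x):
    """Estimated uplift: P(y=1 | x, treated) - P(y=1 | x, control)."""
    return predict_proba(w_treated, x) - predict_proba(w_control, x)

def toy_group(rate_x1, rate_x0, n=10):
    """Hypothetical data: one binary feature with a fixed response rate per value."""
    xs, ys = [], []
    for x, rate in ((1.0, rate_x1), (0.0, rate_x0)):
        k = round(rate * n)
        xs += [[x]] * n
        ys += [1] * k + [0] * (n - k)
    return xs, ys

# Treatment raises the response rate only for customers with x = 1.
w_treated = fit_logistic(*toy_group(rate_x1=0.8, rate_x0=0.2))
w_control = fit_logistic(*toy_group(rate_x1=0.2, rate_x0=0.2))
uplift_x1 = difference_score(w_treated, w_control, [1.0])
uplift_x0 = difference_score(w_treated, w_control, [0.0])
```

In this toy setting the estimated uplift is close to 0.6 for `x = 1` and close to 0 for `x = 0`. Ranking test observations by such scores and evaluating the top share of them is one way to make a comparison like the one on the top 30% of observations.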
This paper investigates the change in the behavior of prices and volumes on the Russian electricity market caused by changes in the electrical grid. The analysis is performed on two previously unconnected macro regions, which currently have a “fictional” interregional transmission line. We use a dataset with economic variables together with the flow frequency as a technical variable of the electrical grid and show that the latter matters. Our estimates indicate that, given the existence of the interregional link, prices in the regions converge to some extent and generation volumes in each region are shaped by the load of that region and of the adjacent region. In future research, the whole electrical grid of Russia should be taken into account.
Trial-to-trial variability of motor evoked potentials (MEPs) in response to transcranial magnetic stimulation (TMS) is a well-known phenomenon. However, the relationship between the fluctuations of the different types of motor output and other motor system parameters, such as corticospinal excitability, interhemispheric inhibition (IHI) and their interhemispheric asymmetry, has not yet been fully investigated. We studied 20 young healthy right-handed volunteers. Four TMS sessions were performed (two single-pulse TMS sessions and two paired-coil TMS sessions with the IHI paradigm, one of each for each hemisphere); 70 stimuli were delivered during every session. The coefficient of quartile variation (CQV) was used to quantify the trial-to-trial variability of MEP amplitude. Resting motor threshold values were correlated between hemispheres (r = .842, p < .001). The IHI phenomenon from the left hemisphere was obtained in 18 out of 20 volunteers, while the IHI phenomenon from the right hemisphere was shown in 16 out of 20. A strong correlation between the variability of MEP amplitudes during the IHI paradigm and the degree of IHI was found for the left hand (r = −.718, p < .001). We also observed a strong correlation between the CQV of MEPs from both hands in response to single-pulse TMS (r = .632, p = .004). A side-specific correlation between the variability of the responses to single-pulse and paired-coil TMS was found for the dominant hemisphere (r = .524, p = .021). Our preliminary results demonstrate the importance of the trial-to-trial variability of MEPs and its interhemispheric specificity as a defining characteristic of the motor system. This study was partially supported by ofi-m RFBR grant 17-29-02518, by the HSE Basic Research Program and the Russian Academic Excellence Project ‘5-100’.
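The coefficient of quartile variation used above is defined as (Q3 − Q1) / (Q3 + Q1). A minimal sketch follows; the quartile interpolation method and the amplitude values are assumptions, since the abstract does not specify how quartiles were computed.

```python
import statistics

def coefficient_of_quartile_variation(values):
    """CQV = (Q3 - Q1) / (Q3 + Q1), a robust relative measure of dispersion.
    Quartiles use the 'inclusive' method of statistics.quantiles (an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)

# Hypothetical MEP amplitudes (arbitrary units) from one stimulation session.
amplitudes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
cqv = coefficient_of_quartile_variation(amplitudes)  # Q1 = 2.5, Q3 = 5.5 -> 0.375
```

Because it depends only on quartiles, the CQV is less sensitive to occasional extreme amplitudes than the ordinary coefficient of variation.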
The Russian energy system is one of the largest centralized energy systems in the world. For electricity consumption planning, it is essential to consider the heterogeneity of energy system areas in terms of both consumption structure and climate conditions. In this paper, the relationship between electricity consumption and air temperature is investigated using data for 64 Russian regions. Hierarchical and non-hierarchical cluster analysis is employed to form homogeneous groups of regions: three temperature clusters are obtained. Bearing in mind the different electricity needs of the regions, piecewise regression with an endogenous reference temperature is estimated for each temperature cluster. In all clusters, both cooling and heating effects are clearly observable, but the reference temperature differs. For the clusters of hot and middle climate regions the cooling effect prevails, while the heating effect dominates in the cluster of cold regions. Taking these effects into account in energy consumption planning may improve the quality of forecasting. This is of great importance for wholesale electricity market agents and for the functioning of the Russian energy system as a whole.
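A piecewise regression with an endogenous reference temperature, as described above, can be sketched under one common specification: consumption = a + b_heat · max(T* − T, 0) + b_cool · max(T − T*, 0), with the reference temperature T* chosen by a grid search that minimizes the sum of squared errors. This specification, the grid search, the synthetic data, and all names are assumptions for illustration; the abstract does not state the exact model.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the matrix is non-singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def design(temps, t_ref):
    """Columns: intercept, heating term (below t_ref), cooling term (above t_ref)."""
    return [[1.0, max(t_ref - t, 0.0), max(t - t_ref, 0.0)] for t in temps]

def fit_piecewise(temps, load, candidates):
    """Grid search over candidate reference temperatures, which must lie
    strictly inside the observed temperature range. Returns (t_ref, beta)."""
    best = None
    for t_ref in candidates:
        X = design(temps, t_ref)
        beta = fit_ols(X, load)
        sse = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
                  for row, yi in zip(X, load))
        if best is None or sse < best[0]:
            best = (sse, t_ref, beta)
    return best[1], best[2]

# Synthetic, noise-free data with a kink at 18 degrees (hypothetical values).
temps = [float(t) for t in range(-20, 36)]
load = [100.0 + 2.0 * max(18.0 - t, 0.0) + 1.5 * max(t - 18.0, 0.0) for t in temps]
t_ref, (base, heat_slope, cool_slope) = fit_piecewise(temps, load, range(-10, 31))
```

Here `heat_slope` and `cool_slope` play the role of the heating and cooling effects; fitting the model separately for each temperature cluster would allow the reference temperature to differ between clusters.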
Fractional anisotropy (FA) estimated using diffusion tensor magnetic resonance imaging (dMRI) is considered a promising biomarker in ischemic stroke (IS). This study is based on the assumption that assessing FA indices for different white matter tracts can predict the main aspects of rehabilitation potential even without determining the structural and functional bases of these influences.
Objective: to study the diagnostic significance of changes in FA indices to assess various aspects of the rehabilitation potential in acute IS.
Patients and methods. Examinations were made in 100 patients with IS and in 10 individuals without stroke and cognitive impairment. All the patients underwent dMRI and assessments of rehabilitation potential indicators on days 3 and 10 of the disease and at discharge.
Results and discussion. The following indices are of the most value for the functional outcome of acute IS: the FA of the ipsilateral superior longitudinal fasciculus and cingulum bundle; the FA and size of the infarct focus; the asymmetry of FA (rFA) of the cingulum bundle, of the corticospinal tract (at the level of the genu of the internal capsule and the pons) and of the anterior limb of the internal capsule; and the FA of the splenium and of the genu of the internal capsule of the intact hemisphere. The microstructure of these zones determines the state of most rehabilitation domains. With respect to global outcome, the integrity of the associative tracts of the affected hemisphere is more valuable than the microstructure of the intact hemisphere and rFA. The tracts of the intact hemisphere are of particular importance for the restoration of complex rehabilitation domains, such as cognitive status and daily living and social skills, which are necessary to ensure patient independence.
Conclusion. The FA indices of the tracts under study seem to be a clinically acceptable biomarker of various aspects of the rehabilitation potential in acute IS.