Previous studies have used the cue-target paradigm to examine the effect of endogenous temporal attention on visual or auditory stimuli alone. Other studies have found that visual and auditory stimuli are not processed in isolation but produce coherent cognition in the brain when they are presented simultaneously. However, the effect of endogenous temporal attention on audiovisual (AV) stimulus processing remains unclear. Taking advantage of the high temporal resolution of event-related potentials (ERPs), we used a central cue that predicted the time point (600 ms or 1800 ms) of an AV target to investigate whether endogenous temporal attention modulates AV stimulus processing. The results showed that endogenous temporal attention did not change the amplitude of the early ERP component at either the short (600 ms) or the long (1800 ms) cue-target interval, indicating that endogenous temporal attention had no effect on the early stage of AV stimulus processing. However, the late ERP component differed between the short (600 ms) and long (1800 ms) cue-target intervals, supporting a model in which endogenous temporal attention might determine the late stage of AV stimulus processing.