A higher-order function is a function that operates on other functions, either by taking them as arguments, by returning them as results, or both. “Structure and Interpretation of Computer Programs” states the idea plainly: “Procedures that manipulate procedures are called higher-order procedures.” The same concept applies in any language where functions can be passed around as values.
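As a minimal sketch of the two directions named above, the Python below shows one function that takes a function as an argument and one that returns a function as its result. The names `apply_twice` and `make_adder` are hypothetical, chosen only for illustration.

```python
from typing import Callable


def apply_twice(f: Callable[[str], str], value: str) -> str:
    """Takes a function as an argument and applies it two times."""
    return f(f(value))


def make_adder(n: int) -> Callable[[int], int]:
    """Returns a new function that adds n to its argument."""
    def add(x: int) -> int:
        return x + n
    return add


shout = apply_twice(lambda s: s + "!", "hi")  # "hi!!"
add_three = make_adder(3)
seven = add_three(4)  # 7
```

Both are higher-order by the definition above: the first consumes a function, the second produces one.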
SICP explains why this matters. To build powerful abstractions, a language needs to support procedures that accept other procedures as arguments or return procedures as values, because many useful computations share a common shape that differs only in some plugged-in operation. Capturing that shape once, with the varying part supplied as a function argument, turns a family of similar procedures into a single general one.
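The idea of a shared shape can be sketched with a summation over a range, loosely modeled on the summation example in SICP but written here in Python rather than Scheme. The names `summation`, `term`, and `step` are assumptions made for this sketch, not the book's own code.

```python
from typing import Callable


def summation(
    term: Callable[[int], int],
    a: int,
    step: Callable[[int], int],
    b: int,
) -> int:
    """Add term(x) for x = a, step(a), step(step(a)), ... while x <= b."""
    total = 0
    while a <= b:
        total += term(a)
        a = step(a)
    return total


def sum_integers(a: int, b: int) -> int:
    # The plugged-in operation is the identity function.
    return summation(lambda x: x, a, lambda x: x + 1, b)


def sum_cubes(a: int, b: int) -> int:
    # Same loop shape; only the term changes.
    return summation(lambda x: x ** 3, a, lambda x: x + 1, b)


print(sum_integers(1, 10))  # 55
print(sum_cubes(1, 10))     # 3025
```

The loop lives only in `summation`; each specific procedure shrinks to a choice of `term` and `step`.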
The most familiar higher-order functions are map, filter, and reduce. Map applies a given function to every element of a collection and returns the results; filter keeps the elements for which a given predicate returns true; reduce combines the elements into a single value by repeatedly applying a supplied two-argument function. In each case the loop structure is written once and the caller supplies the behavior.
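A short illustration of the three functions, using Python's built-in `map` and `filter` and the standard library's `functools.reduce`. The sample list and lambdas are arbitrary choices for this sketch.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# map: apply a function to every element.
squares = list(map(lambda n: n * n, numbers))

# filter: keep the elements for which the predicate is true.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# reduce: fold the elements into one value with a two-argument function,
# starting from the initial value 0.
total = reduce(lambda acc, n: acc + n, numbers, 0)

print(squares)  # [1, 4, 9, 16, 25, 36]
print(evens)    # [2, 4, 6]
print(total)    # 21
```

None of the three calls contains an explicit loop; the iteration is owned by the higher-order function and only the per-element behavior is supplied.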
Higher-order functions are a central tool for abstraction and reuse in functional programming. Function composition, which builds a new function by feeding the output of one into the input of another, is itself a higher-order operation. By treating functions as ordinary data, a program can express general patterns directly rather than repeating nearly identical code.
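Composition can be sketched in a few lines of Python. The helper `compose` below is a hypothetical name (Python's standard library does not provide one), and it follows the usual mathematical convention that `compose(f, g)` applies `g` first and then `f`.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Return a function that computes f(g(x))."""
    def composed(x: A) -> C:
        return f(g(x))
    return composed


def increment(n: int) -> int:
    return n + 1


def double(n: int) -> int:
    return n * 2


# double runs first, then increment: (5 * 2) + 1
double_then_increment = compose(increment, double)
print(double_then_increment(5))  # 11
```

`compose` both takes functions as arguments and returns a function, so it is higher-order in both senses.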