Present-day functional languages (so basically Haskell and Scala) don't do that at the "host PL" level, but you can model it in-band, and modeling embedded PLs is what most of the cool FP research of the last 5 years has been about: free monads, tagless-final interpreters, etc. You make a DSL with abstract capabilities and then code your application in that. Any reified behaviors end up encoded in the type. I hear Facebook is using algebraic effects to model and track data access.
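To make "a DSL with abstract capabilities" a bit more concrete, here is a rough sketch in Python rather than Haskell or Scala, so only the shape of the idea carries over; Python's types won't track effects the way a Haskell signature would. The names `UserStore`, `InMemoryStore`, `AuditingStore` and `greeting` are made up for illustration, and this is not a claim about how Facebook or anyone else does it.

```python
from typing import Dict, List, Protocol, Tuple


class UserStore(Protocol):
    """Hypothetical capability: the only way application code may read user data."""

    def get_name(self, user_id: int) -> str: ...

    def get_email(self, user_id: int) -> str: ...


def greeting(store: UserStore, user_id: int) -> str:
    """Application code, written only against the abstract capability."""
    return f"Hello, {store.get_name(user_id)}!"


class InMemoryStore:
    """Interpreter 1: answers queries from a plain dict."""

    def __init__(self, rows: Dict[int, Dict[str, str]]) -> None:
        self._rows = rows

    def get_name(self, user_id: int) -> str:
        return self._rows[user_id]["name"]

    def get_email(self, user_id: int) -> str:
        return self._rows[user_id]["email"]


class AuditingStore:
    """Interpreter 2: wraps another store and records every field that is read."""

    def __init__(self, inner: UserStore) -> None:
        self._inner = inner
        self.accessed: List[Tuple[int, str]] = []

    def get_name(self, user_id: int) -> str:
        self.accessed.append((user_id, "name"))
        return self._inner.get_name(user_id)

    def get_email(self, user_id: int) -> str:
        self.accessed.append((user_id, "email"))
        return self._inner.get_email(user_id)
```

The same `greeting` runs unchanged under either interpreter; with `AuditingStore` you additionally get a log showing it touched the name field and never the email.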
FP is relatively convenient for metaprogramming, yes. GADTs and tagless-final encodings (or as I think of them, Church-encoded ASTs) are especially useful for this.
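For anyone who hasn't seen the "Church-encoded AST" view, a small sketch of what I mean, in Python for brevity (no GADTs here, so the typing is much looser than the Haskell version). All names (`Lit`, `Add`, `ExprAlg`, `program`, and the three interpreters) are illustrative assumptions, not from any library.

```python
from dataclasses import dataclass
from typing import Protocol, TypeVar, Union

R = TypeVar("R")


# Initial encoding: the AST is plain immutable data.
@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Add]


def eval_initial(expr: Expr) -> int:
    """Evaluate the data AST by case analysis."""
    if isinstance(expr, Lit):
        return expr.value
    return eval_initial(expr.left) + eval_initial(expr.right)


# Final encoding: a term is a function over "any interpreter with these operations".
class ExprAlg(Protocol[R]):
    def lit(self, value: int) -> R: ...

    def add(self, left: R, right: R) -> R: ...


def program(alg: ExprAlg[R]) -> R:
    """The term 1 + (2 + 3), with its meaning left to the interpreter."""
    return alg.add(alg.lit(1), alg.add(alg.lit(2), alg.lit(3)))


class Eval:
    """Interpreter: compute the integer value."""

    def lit(self, value: int) -> int:
        return value

    def add(self, left: int, right: int) -> int:
        return left + right


class Show:
    """Interpreter: pretty-print the term."""

    def lit(self, value: int) -> str:
        return str(value)

    def add(self, left: str, right: str) -> str:
        return f"({left} + {right})"


class Reify:
    """Interpreter: rebuild the data AST, showing the two encodings carry the same term."""

    def lit(self, value: int) -> Expr:
        return Lit(value)

    def add(self, left: Expr, right: Expr) -> Expr:
        return Add(left, right)
```

`program` never mentions a concrete AST type, yet `program(Reify())` hands you back the ordinary data AST, which is why I think of the final encoding as the Church encoding of the initial one.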
But it wouldn't make a big difference if the host language were heavily imperative, so long as it favored immutable data structures. The FP aspect of this exploration of DSLs is much more social and cultural than technical.
I think that if we really want to decouple the AST from its evaluation context, the main tool we'd need is to split module 'import' into separate steps: 'load' a module's AST, then 'integrate' its definitions, allowing for intermediate processing in between. This requires some host language changes, elimination of module-global state, etc.
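You can approximate the load/integrate split today in a language that exposes its own parser. A minimal sketch using Python's standard `ast` module; the function names `load` and `integrate`, the `RenameFunctions` pass, and the `SOURCE` snippet are all hypothetical, and this does nothing about module-global state, so it's a rough illustration of the idea rather than the language change I'm describing.

```python
import ast
from typing import Any, Dict

# Stand-in for a module's source text (illustrative only).
SOURCE = '''
GREETING = "hello"

def shout(text):
    return text.upper() + "!"
'''


def load(source: str, name: str = "<dsl>") -> ast.Module:
    """Step 1: parse the module into an AST without executing anything."""
    return ast.parse(source, filename=name)


class RenameFunctions(ast.NodeTransformer):
    """Example intermediate pass: prefix every function definition's name."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        self.generic_visit(node)  # also rewrite any nested definitions
        node.name = self.prefix + node.name
        return node


def integrate(tree: ast.Module, namespace: Dict[str, Any], name: str = "<dsl>") -> Dict[str, Any]:
    """Step 2: evaluate the (possibly rewritten) AST into a caller-chosen namespace."""
    code = compile(ast.fix_missing_locations(tree), name, "exec")
    exec(code, namespace)
    return namespace
```

Usage would look like `integrate(RenameFunctions("dsl_").visit(load(SOURCE)), {})`: nothing in the module runs until `integrate`, and the definitions land in whatever namespace the caller passes in, under names the intermediate pass chose.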