How do I replace a quantstrat 'for loop' with mclapply [parallelized]?

Aug 16 2020

I would like to parallelize quantstrat. My code isn't exactly like this, but it demonstrates the problem. I believe the issue is that the .blotter env is initialized at a pointer memory address and I can't initialize an array/matrix of new.env().

What I would like to do is replace the for loop with an mclapply so that I can run multiple applyStrategy calls with varying dates/symbols (only varying symbols are shown here). My end goal is a Beowulf cluster (makeCluster), and I plan to run these in parallel using up to 252 trading days (rolling window) with different symbols per iteration. I don't need all of that yet, though. I'm simply asking whether there is a way to work with the portfolio assignment and the subsequent .blotter memory object such that I can use mclapply.

#Load quantstrat in your R environment.

rm(list = ls())


library(quantstrat) 
library(parallel)

# The search command lists all attached packages.
search()

symbolstring1 <- c('QQQ','GOOG')
#symbolstring <- c('QQQ','GOOG')

#for(i in 1:length(symbolstring1))
mclapply(symbolstring1, function(symbolstring)
{
  #local()
  #i=2
  #symbolstring=as.character(symbolstring1[i])
  
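  # These are function-local variables, not the environments the packages use.
  .blotter <- new.env()
  .strategy <- new.env()
  
  # Define the names first so that rm.strat() has something to remove.
  portfolioName <- accountName <- strategyName <- paste0("FirstPortfolio", symbolstring)
  try(rm.strat(strategyName), silent = TRUE)
  for (name in ls(FinancialInstrument:::.instrument)) {
    rm_instruments(name, keep.currencies = FALSE)
  }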
  print(symbolstring)

currency('USD')

stock(symbolstring,currency='USD',multiplier=1)

# Currency and trading instrument objects stored in the 
# .instrument environment

print("FI")
ls(envir=FinancialInstrument:::.instrument)

# blotter functions used for instrument initialization 
# quantstrat creates a private storage area called .strategy

ls(all=T)

# The initDate should be lower than the startDate. The initDate will be used later while initializing the strategy.

initDate <- '2010-01-01'

startDate <- '2011-01-01'

endDate <- '2019-08-10'

init_equity <- 50000

# Set UTC TIME

Sys.setenv(TZ="UTC")

getSymbols(symbolstring,from=startDate,to=endDate,adjust=TRUE,src='yahoo')

# Define names for portfolio, account and strategy. 

#portfolioName <- accountName <- strategyName <- "FirstPortfolio"
portfolioName <- accountName <- strategyName <- paste0("FirstPortfolio",symbolstring)

print(portfolioName)
# The function rm.strat removes any strategy, portfolio, account, or order book object with the given name. This is important

#rm.strat(strategyName)

print("port")
initPortf(name = portfolioName,
          symbols = symbolstring,
          initDate = initDate)

initAcct(name = accountName,
         portfolios = portfolioName,
         initDate = initDate,
         initEq = init_equity)

initOrders(portfolio = portfolioName,
           symbols = symbolstring,
           initDate = initDate)



# name: the string name of the strategy

# assets: optional list of assets to apply the strategy to.  

# Normally these are defined in the portfolio object

# constraints: optional portfolio constraints

# store: can be True or False. If True store the strategy in the environment. Default is False
print("strat")
strategy(strategyName, store = TRUE)

ls(all=T)

# .blotter holds the portfolio and account object 

ls(.blotter)

# .strategy holds the orderbook and strategy object

print(ls(.strategy))

print("ind")
add.indicator(strategy = strategyName, 
              name = "EMA", 
              arguments = list(x = quote(Cl(mktdata)), 
                               n = 10), label = "nFast")

add.indicator(strategy = strategyName, 
              name = "EMA", 
              arguments = list(x = quote(Cl(mktdata)), 
                               n = 30), 
              label = "nSlow")

# Add long signal when the fast EMA crosses over slow EMA.

print("sig")
add.signal(strategy = strategyName,
           name="sigCrossover",
           arguments = list(columns = c("nFast", "nSlow"),
                            relationship = "gte"),
           label = "longSignal")

# Add short signal when the fast EMA goes below slow EMA.

add.signal(strategy = strategyName, 
           name = "sigCrossover",
           arguments = list(columns = c("nFast", "nSlow"),
                            relationship = "lt"),
           label = "shortSignal")

# go long when 10-period EMA (nFast) >= 30-period EMA (nSlow)

print("rul")
add.rule(strategyName,
         name= "ruleSignal",
         arguments=list(sigcol="longSignal",
                        sigval=TRUE,
                        orderqty=100,
                        ordertype="market",
                        orderside="long",
                        replace = TRUE, 
                        TxnFees = -10),
         type="enter",
         label="EnterLong") 

# go short when 10-period EMA (nFast) < 30-period EMA (nSlow)

add.rule(strategyName, 
         name = "ruleSignal", 
         arguments = list(sigcol = "shortSignal", 
                          sigval = TRUE, 
                          orderside = "short", 
                          ordertype = "market", 
                          orderqty = -100, 
                          TxnFees = -10,                     
                          replace = TRUE), 
         type = "enter", 
         label = "EnterShort")

# Close long positions when the shortSignal column is True

add.rule(strategyName, 
         name = "ruleSignal", 
         arguments = list(sigcol = "shortSignal", 
                          sigval = TRUE, 
                          orderside = "long", 
                          ordertype = "market", 
                          orderqty = "all", 
                          TxnFees = -10, 
                          replace = TRUE), 
         type = "exit", 
         label = "ExitLong")

# Close Short positions when the longSignal column is True

add.rule(strategyName, 
         name = "ruleSignal", 
         arguments = list(sigcol = "longSignal", 
                          sigval = TRUE, 
                          orderside = "short", 
                          ordertype = "market", 
                          orderqty = "all", 
                          TxnFees = -10, 
                          replace = TRUE), 
         type = "exit", 
         label = "ExitShort")

print("summary")
summary(getStrategy(strategyName))

# Summary results are produced below

print("results")
results <- applyStrategy(strategy= strategyName, portfolios = portfolioName,symbols=symbolstring)

# applyStrategy() outputs all transactions that the strategy sends, from oldest to most recent. The first few rows of the applyStrategy() output are shown below

getTxns(Portfolio=portfolioName, Symbol=symbolstring)

mktdata

updatePortf(portfolioName)

dateRange <- time(getPortfolio(portfolioName)$summary)[-1]

updateAcct(portfolioName,dateRange)

updateEndEq(accountName)

print(plot(tail(getAccount(portfolioName)$summary$End.Eq,-1), main = "Portfolio Equity"))

#cleanup
for (name in symbolstring) rm(list = name)
#rm(.blotter)
rm(.stoploss)
rm(.txnfees)
#rm(.strategy)
rm(symbols)

}
)

But an error is thrown: Error in get(symbol, envir = envir): object 'QQQ' not found

Specifically, the problem is that FinancialInstrument:::.instrument points to a memory address that isn't updated by my calls with the encapsulated variable (symbolstring).
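The error message itself can be reproduced in miniature with base R alone. The sketch below is an assumption about the mechanism, not a diagnosis of quantstrat internals: lookup_prices and run_one are hypothetical stand-ins, and the idea is only that an object created inside the function passed to mclapply lives in that function's frame, so a lookup by name that starts from a different environment will not find it.

```r
# Hypothetical stand-in for a package function that finds data by name.
# By default it searches from the global environment, not the caller's frame.
lookup_prices <- function(symbol, envir = globalenv()) {
  get(symbol, envir = envir)
}

run_one <- function(symbol) {
  here <- environment()
  # The data is created in this function's own frame only.
  assign(symbol, c(1, 2, 3), envir = here)

  # Lookup from the default environment: the object is not visible there.
  found_default <- tryCatch({
    lookup_prices(symbol)
    TRUE
  }, error = function(e) FALSE)

  # Lookup with the frame passed explicitly: the object is found.
  found_explicit <- tryCatch({
    lookup_prices(symbol, envir = here)
    TRUE
  }, error = function(e) FALSE)

  c(default = found_default, explicit = found_explicit)
}

res <- run_one("QQQ_demo")
print(res)
```

If this is what is happening here, the thing to check is which environment the market data is loaded into and which one the strategy code reads from.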

Answers

3 BrianG.Peterson Aug 17 2020 at 20:37

apply.paramset in quantstrat already uses a foreach construct to parallelize the execution of applyStrategy.

apply.paramset needs to do a fair amount of work to make sure that the environments are available on the workers to do the work, and to collect the proper results to send back to the calling process.
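To illustrate why results have to be collected and sent back, here is a minimal base-R sketch using only the parallel package. It does not use quantstrat: store and run_backtest are hypothetical stand-ins for package-level state and a per-symbol backtest. When mclapply forks, each worker writes into its own copy of the environment, so only the returned value reaches the calling process.

```r
library(parallel)

# Hypothetical stand-in for package-level state such as a portfolio store.
store <- new.env()

run_backtest <- function(symbol) {
  # Side effect: when forked, this lands in the worker's copy of `store`.
  assign(symbol, nchar(symbol), envir = store)
  # Return value: this is what mclapply hands back to the caller.
  list(symbol = symbol, value = nchar(symbol))
}

# Forking is unavailable on Windows, where mclapply needs mc.cores = 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
results <- mclapply(c("QQQ", "GOOG"), run_backtest, mc.cores = cores)

values <- vapply(results, function(r) r$value, integer(1))
print(values)
# With forked workers the caller's `store` is expected to stay empty.
print(ls(store))
```

Under these assumptions, anything a worker should contribute (transactions, statistics, a portfolio object) has to be part of the function's return value rather than something left behind in an environment.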

The simplest thing for you to do is probably to use apply.paramset. Create your parameters for dates and symbols, and have the function run normally.

Alternatively, I suggest you look at the steps necessary to use a parallel foreach construct in apply.paramset and modify them for your proposed case.

Also note that your question refers to using both a Beowulf cluster and mclapply. This will not work. mclapply only works in a single memory space. Beowulf clusters typically do not share a single memory and process space; they usually distribute jobs via parallel libraries such as MPI. apply.paramset could already distribute across a Beowulf cluster by using a doMPI backend for foreach. That is one of the reasons we used foreach: the multitude of different parallel backends that are available. The doMC backend for foreach actually uses mclapply behind the scenes.
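The difference between the two models can be sketched with the parallel package alone. This is an illustration under assumptions, not quantstrat code: equity_demo and add_equity are hypothetical names, and a local two-worker socket cluster stands in for a real multi-node cluster. Forked workers see the caller's objects, while cluster workers are separate R processes that only know what is exported to them.

```r
library(parallel)

equity_demo <- 50000                       # exists only in the calling process
add_equity <- function(x) equity_demo + x  # relies on `equity_demo` being visible

# Forked workers (mclapply) share the caller's memory image.
# Forking is unavailable on Windows, where mclapply needs mc.cores = 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
forked <- unlist(mclapply(1:3, add_equity, mc.cores = cores))

# Cluster workers start as fresh R sessions with their own memory.
cl <- makeCluster(2)
before <- unlist(clusterEvalQ(cl, exists("equity_demo")))

# The object has to be shipped to every worker explicitly.
clusterExport(cl, "equity_demo", envir = environment())
after <- unlist(clusterEvalQ(cl, exists("equity_demo")))

clustered <- unlist(parLapply(cl, 1:3, add_equity))
stopCluster(cl)

print(before)
print(after)
print(identical(forked, clustered))
```

The explicit export step is a small-scale version of the setup work that a cluster backend for foreach has to perform before any strategy code can run on a worker.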